pore complex (NPC). Newly resolved 
components include the symmetrical core 
(orange) and cytoplasmic filaments (yel- 
low). The NPC, one of the largest assemblies 
in eukaryotic cells, is the bidirectional 
gateway for macromolecule transport. 
Researchers have combined several analyti- 
cal techniques to determine the composite 
structure of the human NPC at near-atomic 
resolution. Additionally, 
Portion of 1640 clinical trials rated as “bad,” 
defined as those at high risk of bias because of 
selective reporting of results and other flaws. 


The bad ones wasted as much as £8 billion. (Trials) 


Edited by Jeffrey Brainard 


Pigs are housed at a farm on the outskirts of Hanoi as Vietnam works to counter swine disease. 


Vaccine targets African swine fever 


Vietnam’s agriculture ministry last week gave limited authori- 
zation to a vaccine hailed as an important tool to control one 
of the most serious animal diseases, African swine fever (ASF). 
In recent years the sickness has hit pig herds hard in several 
Asian and European countries. Vietnam’s National Veterinary 
Joint Stock Company developed the vaccine based on an ASF 
virus strain engineered by the U.S. Agricultural Research Service to 
lack a gene linked to virulence. A small trial of 20 animals, reported in 
September 2021, found strong evidence of protection; the company says 
an unpublished, follow-up trial of 131 pigs showed 99% of those that 
got full doses survived ASF infections. Based on these results, the min- 
istry approved commercial use of the vaccine in up to 600,000 pigs. It 
will evaluate results before deciding whether to allow nationwide use. 
Endemic in Africa, ASF spread through much of Europe in the 2000s 
and to Asia in 2018, requiring culling and creating shortages of pork, a 
major source of protein throughout the region. 
FDA panel backs Novavax vaccine 


COVID-19 | The U.S. Food and Drug 
Administration’s (FDA’s) vaccine advisory 
panel this week recommended nearly 
unanimously that the agency authorize 
a protein-based COVID-19 vaccine from 
Novavax, which would be the first of its 
kind available to U.S. adults. Panel mem- 
bers said benefits of the vaccine, made of 
the SARS-CoV-2 spike protein combined 
with an immune-boosting substance, 
outweighed risks when it is given in two 
doses 3 weeks apart to those 18 years and 
older. FDA doesn’t have to abide by its 
advisers’ recommendations but usually 
does. In a 30,000-person trial in the United 
States and Mexico, the vaccine was 90.4% 
efficacious at preventing symptomatic 
infection by early strains of SARS-CoV-2. 
The recommendation came days after FDA posted 
data documenting five cases of myocar- 
ditis or pericarditis—inflammations of 
heart tissue—in volunteers, most of them 
young men, soon after they received the 
vaccine in U.S. and U.K. clinical trials. 
Novavax hopes its product will attract U.S. 
recipients skeptical of vaccines that employ 
messenger RNA and booster seekers who 
favor its proven method, which has led to 
licensed vaccines for other diseases, such 
as shingles. 


Studies of low-dose radiation urged 


BIOMEDICINE | The U.S. government 
should spend $100 million per year for at 
least 15 years to study the health effects of 
low-dose radiation, a high-profile review 
panel concluded last week. The public and 
workers are routinely exposed to low-dose 
radiation (below 100 milligray, a measure- 
ment of absorbed dose) from sources such 
as medical scans, air travel, and mining, 
which contributes to cancer and possibly 
heart disease and other health problems. 
The Department of Energy’s (DOE’s) Office 
of Science ended a long-running program 
to study low-dose radiation in 2016 so 
it could focus on other priorities. But in 
2018, Congress mandated its revival and 
later asked the National Academies of 
Sciences, Engineering, and Medicine for a 
new blueprint. The research is important 
and should resume, although not entirely 
under DOE’s sponsorship, the academies’ 
report says, noting the conflicts of interest 
associated with its nuclear weapons facili- 
ties. The report recommends the National 
Institutes of Health fund epidemiological 
and biological studies; DOE should over- 
see computational work and modeling. 
Congress must now decide whether to 
appropriate the funding. 


Bees get protected as ‘fish’ 


CONSERVATION | Four species of bumble 
bees qualify for protection under California’s 
Endangered Species Act because they fit a 
loophole in the state’s definition of “fish,” an 
appeals court ruled last week. Until now, the 
state law protected no insect species. But a 
state Court of Appeal based in Sacramento 
pointed to California’s Fish and Game Code, 
which includes in the definition of fish any 
“mollusk, crustacean, invertebrate, (or) 
amphibian.” That wording covers any terres- 
trial invertebrate, such as a bumble bee, the 
court wrote. The ruling was celebrated by 
conservation organizations and bemoaned 
by agricultural groups, which argued that 
extending the protection to the bumble 
bees would burden farming operations. Bee 
populations have declined across the United 
States and elsewhere, posing threats to agri- 
cultural crops and other plants that depend 
on pollinators for healthy development. 


NIH grantees lax on foreign detail 


RESEARCH SECURITY | A U.S. government 
watchdog has found that many institu- 
tions receiving funding from the National 
Institutes of Health (NIH) don’t follow 
federal rules on reporting foreign sources 
of support, educating scientists about those 
rules, and investigating possible conflicts 
of interest. A 22 June report by the inspec- 
tor general of NIH’s parent department 
found, for example, that 36% of the more 
than 600 institutions surveyed in late 2020 
don’t require their faculty members to 
disclose participation in another country’s 
talent recruitment program and 37% don’t 
distinguish between domestic and foreign 
funding. Since 2018, NIH has been especially 
vigilant in tracking grantees’ links to China 
as part of a governmentwide campaign to 
prevent the theft of U.S.-funded research 
by that country. The report calls on NIH to 
enforce the existing rules, which institutions 
must obey as a condition of funding. 


Rice led to chicken domestication 


EVOLUTION | People around the world 
know chicken and rice is a winning 
culinary combination. But now, scientists 
say without rice, there might not have 
been chickens. It wasn’t until humans 
began clearing forest and sowing rice 
seeds within the range of red jungle fowl 
in Southeast Asia that some of these 
wild pheasants swept down from the 
trees to feed on the seeds—and evolved 
into more docile chickens, according 
to research published this week in the 
Proceedings of the National Academy of 
Sciences. This taming of the jungle fowl 
happened much more recently than other 
studies have estimated, according to the 
comprehensive analysis of bones and 
dates at more than 600 sites, which found 
that some bones thought to be chickens 
belonged to other animals. The authors 
say the oldest chickens appear just 
3250 to 3650 years ago at a rice farming 
site in what is now central Thailand. Then, 
chickens spread across Asia with rice and 
millet farming. 


IN FOCUS Egypt's antiquities ministry last week unveiled a new collection of artifacts 
from its Late Period (about 664 B.C.E. to 332 B.C.E.) found within the Saqqara 
necropolis, near Cairo. The new discoveries from the previously excavated cemetery 
include 150 bronze statues of ancient Egyptian deities and 250 wooden sarcophagi. 
NEWS 


INFECTIOUS DISEASE 


Monkeypox vaccination plans 
take shape amid questions 


Favored shot is a seemingly safer smallpox vaccine, but 
researchers debate how best to use it 


By Kai Kupferschmidt 


In 1959, German microbiologist Anton 
Mayr took a strain of vaccinia, a poxvi- 
rus used to inoculate against smallpox, 
and started to grow it in cells taken 
from chicken embryos. After several 
years of transferring the strain to fresh 
cells every few days, the virus had changed 
so much it could no longer reproduce in 
most cells from mammals. But it could still 
produce an immune response that pro- 
tected against smallpox. 

Mayr had set out to study how poxviruses 
evolve, but by accident he had produced a 
potentially safer smallpox vaccine. Dubbed 
Modified Vaccinia Ankara (MVA) because 
the original viral strain came from that 
Turkish city, the vaccine had a short career. 
“With smallpox eradicated in 1980, it disap- 
peared into the freezer,” says Gerd Sutter, a 
virologist at the Ludwig Maximilian Univer- 
sity of Munich, who has studied Mayr’s vac- 
cinia strain for decades. 

Now, this virus, further weakened and 
brought to the market by the Danish 
pharma company Bavarian Nordic, may 
become key to arresting the largest out- 
break of monkeypox ever seen outside Af- 
rica, which has already sickened more than 
1000 people. It is the only vaccine licensed 
anywhere for use against monkeypox, al- 
though other, riskier smallpox vaccines 
also appear to work. The United States, the 
United Kingdom, Canada, and several other 
countries have already started to “ring” vac- 
cinate, offering it to contacts of identified 
monkeypox cases, including health care 
workers and sexual partners. “MVA will be 
very important in this outbreak because it 
is a nonreplicating vaccine, which means 
it doesn’t have the same side effect profile 
as some of the other live [virus] vaccines” 
being considered, says Rosamund Lewis, 
technical lead on monkeypox at the World 
Health Organization (WHO). 


Canada has begun to offer a monkeypox vaccine 
made by Bavarian Nordic to select groups, including 
contacts of known cases of the disease. 

But what role the vaccine will ultimately 
play depends on a host of factors: whether 
those most at risk from infection can be iden- 
tified and vaccinated, whether the vaccine is 
as effective as hoped, and whether enough 
is available to stop the burgeoning outbreak. 
WHO has so far only backed ring vaccina- 
tion—MVA is ideally given within 4 days 
of an exposure but recommended for up to 
14 days—but some scientists say it’s too dif- 
ficult to reach the specific contacts people 
had. They advocate broader vaccination 
campaigns in the population most affected 
so far: men who have sex with men (MSM). 

Hundreds of millions of doses of small- 
pox vaccine are stored around the world, 
insurance against a possible release of the 
dreaded virus by terrorists or in war, and 
they are known to offer some protection 
against monkeypox. A study in the Demo- 
cratic Republic of the Congo (DRC) in the 
1980s found that household contacts of 
people sick with monkeypox were seven 
times less likely to contract the disease if 
they had been vaccinated against smallpox. 
Yet the vast majority of existing smallpox 
vaccines consist of still-replicating vaccinia. 
These can cause rare but life-threatening 
side effects such as encephalitis or pro- 
gressive vaccinia, the spread of the vaccine 
virus to the whole body, to which immuno- 
compromised people are vulnerable. 

Although 66 people have already died of 
monkeypox this year in African countries, 
the recent cases in nonendemic countries 
have mostly been mild. And many con- 
tacts of those infected are living with HIV, 
which could make them more likely to suf- 
fer from vaccinia side effects. Given the 
risks and benefits, “using these vaccines is 
out of the question,” Sutter says. 

Bavarian Nordic’s nonreplicating vac- 
cine, marketed as Jynneos in the United 
States and as Imvanex in Europe, sidesteps 
some of the risk. So does a vaccinia-based 
vaccine named LC16m8, licensed for small- 
pox only in Japan, which also appears to 
cause fewer side effects. “I believe these 
are the ones that are going to be used 
[in the new outbreak] because they have 
a much-enhanced safety profile,” says 
Marion Gruber, who headed the U.S. Food 
and Drug Administration’s vaccine office 
until October 2021. 

Canada and the United States have 
already licensed MVA for use against 
monkeypox and Bavarian Nordic is in 
talks with the European Medicines Agency 
(EMA). “I really hope that in a matter of 1 or 
2 months from now, this can be approved” 
in Europe, says EMA’s Marco Cavaleri. 

The United Kingdom has been using 
MVA “off-label” for a few years to vaccinate 
contacts of imported monkeypox cases. 
WHO’s Strategic Advisory Group of Experts 
on Immunization is set to release guidance 
in the next days that will back MVA, but it 
will also recommend using earlier vaccines 
in certain scenarios. Still, 
Cavaleri says, “If [MVA] is 
available, clearly that will 
be the vaccine to start.” 

Exactly how much is 
available remains murky. 
“Countries have been reluc- 
tant over the past couple of 
decades to share that infor- 
mation in detail with WHO 
but WHO is now reaching 
out to all of them again,” Lewis says. The 
United States, which supported develop- 
ment of MVA, likely has the biggest supply. 
A federal spokesperson says the Strategic 
National Stockpile has 36,000 doses, that an- 
other 36,000 doses will be delivered “in the 
near future,” and that the company is storing 
bulk material for millions more U.S.-reserved 
doses. A Bavarian Nordic spokesperson says 
many other countries had ordered its MVA 
vaccine in the past weeks and the company 
was trying to send smaller batches to coun- 
tries “so they can start to vaccinate sooner 
rather than later.” 

How broadly to roll out MVA, or any vac- 
cine, remains the key debate. Ring vaccina- 
tion among MSM can be challenging given 
the stigma faced by that group in many 
cultures and the nature of the contacts. A 
paper published last week in Eurosurveil- 
lance noted that in the United Kingdom, 
many of those infected reported sexual 
contacts with people whose details they ei- 
ther did not know or did not want to share. 

The Canadian province of Quebec has 
already extended vaccinations from direct 
contacts of monkeypox cases to any men 
who have had more than two male sexual 
partners in the past 14 days. Another way 
to tackle the problem, says Yale School 
of Public Health epidemiologist Gregg 
Gonsalves, “would be to offer it to indi- 
viduals who have attended social events in 
which close contact with someone infected 
was possible, but this would increase the 
numbers of those being recruited for vacci- 
nation even further.” Germany is also likely 
to offer the vaccine more broadly, though it 
will not quickly have enough vaccine avail- 
able for all MSM, says Leif Erik Sander, 
an infectious disease expert at the Charité 
University Hospital in Berlin. 

Even where contacts of infected 
cases were identified, uptake has been 
low. The same U.K. study reported that 
169 out of 245 health care workers who 
had been offered MVA had taken it, but 
only 15 out of 107 contacts in other groups. 
“It’s very challenging to target high-risk 
groups while balancing stigma and en- 
couraging uptake of the vaccines,” says 
Boghuma Titanji, a virologist at Emory 
University. The politicization of vaccines 
during COVID-19 has 
increased the barriers, 
she adds. 

How well MVA really 
protects humans from 
monkeypox is uncertain. 
The license for MVA in 
Canada and the United 
States is based on ani- 
mal studies, where it was 
shown to protect macaques 
and prairie dogs, plus data in humans 
showing a strong antibody response. A pair 
of DRC studies vaccinated 1600 health care 
workers with one of two MVA formulations 
and found no monkeypox cases in each 
2-year study period. But there were no con- 
trol groups, and one vaccinated health care 
worker did get monkeypox half a year later. 
“The truth is, we don’t know the efficacy of 
any of these monkeypox vaccines,” says Ira 
Longini, a biostatistician at the University 
of Florida who is advising WHO. 

That is why WHO has urged countries 
that deploy monkeypox vaccine to study 
how well it works and how best to use it. 
“If we want to contain these outbreaks and 
learn something about the efficacy of these 
vaccines, it’s going to have to be a con- 
certed effort with protocols and organized 
properly,” Longini says. One question is 
whether a single dose of the vaccine, which 
is normally given as two doses 4 weeks 
apart, may suffice. That could encourage 
more uptake and stretch supplies. 

The question of vaccine equity looms 
large, too. Titanji notes that the hopes for 
MVA are based partly on the DRC data. “It’s 
almost a moral obligation to make sure 
that, if these vaccines are being utilized 
elsewhere now, the people on whom the 
data was generated, who have been deal- 
ing with monkeypox for 50 years, should 
have access to it as well.” 


Chile’s 
Indigenous 
groups seek 
fairer research 


New constitution may help 
reset relationship between 
scientists and communities 


By Emiliano Rodriguez Mega 


The small fishing settlement of Puerto 
Edén is nestled on Wellington Island 
in southern Chile, among a labyrinth 
of islets and fjords at least a day’s 
journey from the nearest city. But the 
distance and Patagonian cold have 
not discouraged generations of scientists 
from making the trip. Puerto Edén is home 
to some of the Kawésqar, descendants of no- 
madic seafarers. Their culture, territory, the 
remains of their ancestors, and their dying 
language have all drawn academic interest. 

But the goals of researchers and the com- 
munity have sometimes been at odds, says 
Ayelen Tonko Huenucoy, a Kawésqar physi- 
cal anthropologist at the Chilean National 
Museum of Natural History, who partly 
grew up in Puerto Edén. “Several scien- 
tists arrived in a totally conquerorlike way 
... using us for their [own] goals,” such as 
demanding genetic information from the 
community, she says. 

Now, the Kawésqar and other Indigenous 
peoples in Chile hope to see their rights rec- 
ognized for the first time in the country’s 
new constitution, which Chileans will vote 
on in a referendum in September. And other 
efforts to balance the relationship between 
Indigenous groups and scientists in Chile are 
underway, including a collaborative work- 
shop last week on ethics and genomics. 

“We will no longer be the guinea pigs,” 
says Elisa Loncón, a Mapuche linguist at the 
University of Santiago and former president 
of the constitutional convention. “And we 
will not be a hindrance to knowledge either.” 

The constitutional process began in 2019, 
when massive protests against inequality 
called for replacing the constitution enacted 
during Augusto Pinochet’s dictatorship in 
1980. If approved, the new constitution 
would make Chile “plurinational,” with at 
least 11 Indigenous groups, representing 
more than 2 million people or nearly 13% of 
the population, recognized as autonomous 
communities governing their territories. 
They would in theory have more sway over 
their lands than, for example, Native Amer- 
icans in the United States, where the federal 
government holds Indigenous land in trust. 

The draft constitution recognizes the 
existence of Indigenous knowledge and 
protects Indigenous peoples’ identities, cul- 
tures, and territories, including nature in 
its “material and immaterial dimensions.” 
It also gives Indigenous peoples the right 
to repatriate objects and human remains, 
and mandates that the Chilean government 
develop mechanisms for such repatriation, 
perhaps including objects from abroad. 

The new constitution isn’t explicit about 
research with Indigenous communities. But 
it could encourage a more collaborative ap- 
proach that considers local and ancestral 
knowledge, says microbiologist Cristina 
Dorador Ortiz, a member of the constitu- 
tional convention that wrote it. 

This stance is new in Chile, where some 
Indigenous people cite past examples of sci- 
entific overreach. “Many times, communities 
complain that research is done on them from 
a Western perspective,” Dorador Ortiz says. 
For example, in the 1990s, Chilean and Japa- 
nese researchers took blood from Huilliche 
communities, who are part of the Mapuche 
people, in southern Chile. Those samples 
and more than 3500 others from Indigenous 
groups across South America are now in a 
public cell bank at the RIKEN BioResource 
Research Center in Tsukuba, Japan. Cell 
lines derived from the samples, expected to 
be useful for studies on human migration 
or genetic variations in drug response, are 
available to scientists worldwide, with a tube 
costing about $110. But donors never saw 
any benefits, Tonko Huenucoy says. 


Linguist Elisa Loncón carries the Mapuche flag at Chile’s constitutional convention in 2021 in Santiago. 


It’s a story familiar to others in Chile. “The 
way research is done nowadays is super- 
convenient” for scientists, says Constanza 
Silva Gallardo, a biological anthropologist 
at Pennsylvania State University, Univer- 
sity Park, and a member of the Diaguita 
Mapochogasta Autonomous Community in 
Santiago. “There needs to be some sort of 
pushback to bring effective change.” 

The proposed constitution could help set 
the stage, although polls suggest its initial 
high popularity has recently fallen. But even 
if it fails, other efforts are ongoing. In March, 
a mostly Chilean team including Tonko 
Huenucoy and Silva Gallardo published a 
paper in Frontiers in Genetics urging geneti- 
cists to abandon stigmatizing narratives that 
magnify any genetic differences between In- 
digenous people and other Chileans. They 
also called for Chilean universities to de- 
velop protocols to incorporate Indigenous 
voices in designing sampling procedures, 
drafting informed consent forms, and inter- 
preting results. 

In late 2021, this same group launched 
a program, Ciencia y Comunidades, to im- 
prove ethical standards in genomic studies 
of Indigenous populations in Chile. Last 
week, they held a workshop at the Pontifi- 
cal Catholic University of Chile with mem- 
bers of the Aymara, Diaguita, Colla, Chango, 
Rapa Nui, and Mapuche (including Huilliche 
and Pehuenche) peoples. Between opening 
and closing ceremonies involving traditional 
dances, attendees discussed how research 
is done, who approves projects, and what 
genetic data can and cannot say about a 
person’s identity. The effort was modeled 
after the Summer Internship for Indigenous 
Peoples in Genomics workshop, an inter- 
national consortium that explores the ethics 
of genomics and aims to train Indigenous 
scientists in the field (Science, 28 September 
2018, p. 1304). 

The goal is to empower communities to 
demand their rights and to “motivate col- 
leagues to work in a different way,” says 
Constanza de la Fuente, a Chilean ancient 
DNA researcher at the University of Chicago 
and a member of Ciencia y Comunidades. 
“To approach the communities not only say- 
ing, ‘This is an informed consent form, sign 
it and give me your sample,’ but trying to 
generate a dialogue with them.” 


Cape Town meeting slams ‘helicopter research’ 


By Cathleen O’Grady 


When researchers from wealthy 
countries engage in “helicopter 
research” (field research 
in poorer countries that 
extracts data without respectful 
collaboration), they violate research 
integrity as well as pose a moral problem, 
said attendees at last week’s World Confer- 
ence on Research Integrity, held in Cape 
Town, South Africa. 
The conference saw the launch of 
the Cape Town Statement on equitable 
research partnerships, which attendees 
will finalize and submit to an academic 
journal. Those at the meeting hope their 
new framing will elevate the issue and help 
spur systemic solutions, rather than leaving 
the task of building fair collaborations up to 
individual researchers. 

Researchers in low- and middle-income 
countries (LMICs) often feel unappreci- 
ated when they partner with researchers 
from wealthier countries, Francis Kombe, 
co-chair of the African Research Integrity 
Network and a contributor to the state- 
ment, told the conference. Local experts are 
too often not listed as authors, cannot ac- 
cess data they gathered, and lack the power 
to steer research to local priorities, studies 
of the issue have found. All this can affect 
the quality of research. 

Such “scientific colonialism” uses the 
same tactics as colonialism has historically, 
Sue Harrison, deputy vice-chancellor for 
research and internationalization at the 
University of Cape Town, said at the event. It 
extracts data instead of raw materials, and it 
undermines and underfunds local infra- 
structure and skills. This leaves researchers 
in LMICs without the publications, patents, 
and skills of their wealthier counterparts. 

The Cape Town Statement will offer a 
guide for how institutions can improve 
collaborations. For example, funders could 
set out expectations for equal authorship 
and data access, says Minal Pathak, a cli- 
mate researcher at Ahmedabad University 
in India. 

She hopes the statement has impact. 
“Maybe it’s not new. But maybe we need to 
say it another time.” 


Although they acknowledge that dialogue 
is needed, some Chilean researchers are 
wary. In other countries, including Canada, 
New Zealand, and the United States, Indig- 
enous communities have asked geneticists 
to delay work, change research questions, 
keep data private, and not publish results. 
“One can’t take absolutist [attitudes]; such 
as insisting that scientists must keep all data 
private, says Lucia Cifuentes, a medical ge- 
neticist at the University of Chile (UCh), San- 
tiago. “Science needs creative freedom.” 

Publishing restrictions would be “censor- 
ship,” says Ricardo Verdugo, a human popu- 
lation geneticist also at UCh Santiago. But he 
thinks a new paradigm is needed. Indigenous 
communities are “the first ones that have the 
right to have a voice,’ he says. “What to ask, 
why ask it, and how I’m going to interpret 
[and] communicate it, is something that ab- 
solutely requires [their] opinion.” 

For others, now is the moment for dras- 
tic measures. “Other scientists might [ques- 
tion] me. But, for me, ethics comes first,’ says 
Macarena Fuentes, a human population 
geneticist at the University of Tarapaca, 
Arica. “For there to be a transition, extreme 
changes must occur.” 

In Puerto Edén, fed up with what they saw 
as one-sided interactions, the community 
created a protocol for scientific research 
within its territory. Scientists must meet 
with a council to explain their research, 
what they’ll do with the results, and how 
Puerto Edén will benefit. They must also re- 
spect Kawésqar culture, including honoring 
taboos against visiting sacred places. And 
they must give something back, whether a 
simple acknowledgement, a share in any fi- 
nancial rewards, or co-authorship. The plan 
may be exceptional in Chile at the moment, 
but many hope it will become the norm in 
the future. 

The protocol isn’t a rejection of science, 
Tonko Huenucoy explains. The community 
even plans to build a science center and 
field station to attract research to the com- 
munity. But they want to make sure that 
its done for and with the Kawésqar, she 
says. So “[Lour] voices are included from the 
very beginning.” 


Emiliano Rodriguez Mega is a journalist 
in Mexico City. 


SCIENCE science.org 


RESEARCH INTEGRITY 


Controversial botanist cleared 


Report sees “insufficient evidence” of misconduct 


By Charles Piller 


When eight scientists filed a misconduct
complaint against prominent botanist
Steven Newmaster with the University of
Guelph (UG) in June 2021, they thought
they had an ironclad case. Their claim
that Newmaster, whose work profoundly in- 
fluenced how dietary supplements are tested 
and marketed, had made up or plagiarized 
data in three papers was “an entirely cred- 
ible and well-founded allegation,” says co- 
signatory Kenneth Thompson, a postdoc at 
Stanford University. 

But an investigative committee at UG 
disagrees. Newmaster “displayed a pattern 
of poor judgement,” his conduct was “suspicious,”
and there were “many shortcomings”
in his work, panel chairman John Walsh, a 
business professor at UG, wrote in a 1 June 
letter to the complainants. But there was 
“insufficient evidence” to find 
Newmaster guilty of misconduct. 

“Given the evidence of data fal- 
sification that was assembled, I 
was very surprised at the conclusion,”
says evolutionary biologist
Paul Hebert, a co-signatory to 
the complaint. Hebert directs 
UG’s Centre for Biodiversity 
Genomics and pioneered DNA barcoding, 
a technique for identifying organisms from 
short snippets of DNA that is central to 
Newmaster’s work. 

But Thomas Braukmann, a former postdoc 
at the UG center who is now at Public Health 
Ontario, says the findings are “disappointing 
but not surprising.” “I don’t think [UG] took 
the allegations seriously or put the committee
together in good faith,” says Braukmann,
who was not involved in the case but studied 
Newmaster’s papers at Science’s request. “We 
need a better system in Canada to handle 
misconduct concerns.” Walsh and UG did not 
respond to requests for comment. 

Newmaster made headlines with a 2013 
BMC Medicine study reporting that many 
herbal supplements didn’t contain the la- 
beled ingredients and some had toxic con- 
taminants. The paper propelled him to global 
fame as a testing expert. His own companies 
and a nonprofit group at UG raised millions 
of dollars by certifying supplements, canna- 
bis, and other comestibles. 

Thompson was the first to level accusations
against Newmaster, in 2020. He
claimed Newmaster made up and plagiarized 
the data for a study testing the ability of DNA 
barcoding to identify species in a Canadian 
forest, which they published together in 2014, 
when Thompson was a UG undergraduate. 
After the university dismissed his complaint, 
the group of eight sent a more elaborate com- 
plaint letter. It fingered not only Thompson’s 
paper—which the journal Biodiversity and 
Conservation retracted in October 2021—but 
also the supplements study and a 2013 pa- 
per using DNA barcoding to study woodland 
caribou diets. Newmaster’s co-authors have 
requested retractions of those articles as well. 

For now, BMC Medicine is investigating 
the allegations about the supplements ar- 
ticle and the Canadian Journal of Forest Re- 
search has added an Expression of Concern 
to the caribou paper. Newmaster has denied 
the charges. “I have never engaged in any 
unethical activity or academic misconduct,” 
he wrote in an official reply ob- 
tained by Science. 

An investigation by Science 
(3 February, p. 484) revealed 
many other cases in which 
Newmaster appeared to ma- 
nipulate or fabricate data, pla- 
giarize, and invent elements of 
his academic record. He did 
not respond to requests for comment for 
the Science story, and the panel did not ad- 
dress the issues it raised. UG could take up 
to several months to issue a final decision. 

Newmaster’s accusers had been worried 
the investigation might not be rigorous, given 
the panel’s lack of expertise in genomics. In 
addition to Walsh, it included Jeff Wichtel, 
dean of UG’s veterinary college, and Cynthia 
Fekken, a psychologist from Queen’s Univer- 
sity. Walsh’s letter says they relied on an in- 
dependent expert witness, whom he did not 
name. The letter notes that a “key factor” in 
the panel’s finding that it was impossible to 
“definitely establish” misconduct was the “ab- 
sence of records, including raw data.” 

That’s a particularly frustrating argument, 
Thompson says. “That was the essence of our 
complaint. We knew they wouldn’t be able to 
find records. Our complaint alleged that Prof. 
Newmaster falsified his work and never had 
the data to back it up.” 


This story was supported by the Science Fund 
for Investigative Reporting. 
more than 2 million people or nearly 13% of
the population, recognized as autonomous 
communities governing their territories. 
They would in theory have more sway over 
their lands than, for example, Native Americans
in the United States, where the federal
government holds Indigenous land in trust. 

The draft constitution recognizes the 
existence of Indigenous knowledge and 
protects Indigenous peoples’ identities, cul- 
tures, and territories, including nature in 
its “material and immaterial dimensions.” 
It also gives Indigenous peoples the right 
to repatriate objects and human remains, 
and mandates that the Chilean government 
develop mechanisms for such repatriation, 
perhaps including objects from abroad. 

The new constitution isn’t explicit about 
research with Indigenous communities. But 
it could encourage a more collaborative ap- 
proach that considers local and ancestral 
knowledge, says microbiologist Cristina 
Dorador Ortiz, a member of the constitu- 
tional convention that wrote it. 

This stance is new in Chile, where some 
Indigenous people cite past examples of sci- 
entific overreach. “Many times, communities 
complain that research is done on them from 
a Western perspective,” Dorador Ortiz says.
For example, in the 1990s, Chilean and Japa- 
nese researchers took blood from Huilliche 
communities, who are part of the Mapuche 
people, in southern Chile. Those samples 
and more than 3500 others from Indigenous 
groups across South America are now in a 
public cell bank at the RIKEN BioResource 
Research Center in Tsukuba, Japan. Cell 
lines derived from the samples, expected to 
be useful for studies on human migration 
or genetic variations in drug response, are 
available to scientists worldwide, with a tube
costing about $110. But donors never saw
any benefits, Tonko Huenucoy says.

Linguist Elisa Loncón carries the Mapuche flag at Chile’s constitutional convention in 2021 in Santiago.

It’s a story familiar to others in Chile. “The 
way research is done nowadays is super- 
convenient” for scientists, says Constanza 
Silva Gallardo, a biological anthropologist 
at Pennsylvania State University, Univer- 
sity Park, and a member of the Diaguita 
Mapochogasta Autonomous Community in 
Santiago. “There needs to be some sort of 
pushback to bring effective change.” 

Cape Town meeting slams ‘helicopter research’


By Cathleen O’Grady


When researchers from wealthy
countries engage in “helicopter
research” (field research
in poorer countries that
extracts data without respectful
collaboration), they violate research
integrity as well as pose a moral problem,
said attendees at last week’s World
Conference on Research Integrity, held in
Cape Town, South Africa.

The conference saw the launch of
the Cape Town Statement on equitable
research partnerships, which attendees
will finalize and submit to an academic
journal. Those at the meeting hope their
new framing will elevate the issue and help
spur systemic solutions, rather than leaving
the task of building fair collaborations up to
individual researchers.

Researchers in low- and middle-income
countries (LMICs) often feel unappreciated
when they partner with researchers
from wealthier countries, Francis Kombe,
co-chair of the African Research Integrity
Network and a contributor to the statement,
told the conference. Local experts are
too often not listed as authors, cannot access
data they gathered, and lack the power
to steer research to local priorities, studies
of the issue have found. All this can affect
the quality of research.

Such “scientific colonialism” uses the
same tactics as colonialism has historically,
Sue Harrison, deputy vice-chancellor for
research and internationalization at the
University of Cape Town, said at the event. It
extracts data instead of raw materials, and it
undermines and underfunds local infrastructure
and skills. This leaves researchers
in LMICs without the publications, patents,
and skills of their wealthier counterparts.

The Cape Town Statement will offer a
guide for how institutions can improve
collaborations. For example, funders could
set out expectations for equal authorship
and data access, says Minal Pathak, a climate
researcher at Ahmedabad University
in India.

She hopes the statement has impact.
“Maybe it’s not new. But maybe we need to
say it another time.”


The proposed constitution could help set
the stage, although polls suggest its initial
high popularity has recently fallen. But even
if it fails, other efforts are ongoing. In March,
a mostly Chilean team including Tonko
Huenucoy and Silva Gallardo published a
paper in Frontiers in Genetics urging geneticists
to abandon stigmatizing narratives that
magnify any genetic differences between Indigenous
people and other Chileans. They
also called for Chilean universities to develop
protocols to incorporate Indigenous
voices in designing sampling procedures,
drafting informed consent forms, and interpreting
results.

In late 2021, this same group launched
a program, Ciencia y Comunidades, to improve
ethical standards in genomic studies
of Indigenous populations in Chile. Last
week, they held a workshop at the Pontifical
Catholic University of Chile with members
of the Aymara, Diaguita, Colla, Chango,
Rapa Nui, and Mapuche (including Huilliche
and Pehuenche) peoples. Between opening
and closing ceremonies involving traditional
dances, attendees discussed how research
is done, who approves projects, and what
genetic data can and cannot say about a
person’s identity. The effort was modeled
after the Summer Internship for Indigenous
Peoples in Genomics workshop, an international
consortium that explores the ethics
of genomics and aims to train Indigenous
scientists in the field (Science, 28 September
2018, p. 1304).

The goal is to empower communities to
demand their rights and to “motivate colleagues
to work in a different way,” says
Constanza de la Fuente, a Chilean ancient 
DNA researcher at the University of Chicago 
and a member of Ciencia y Comunidades. 
“To approach the communities not only say- 
ing, ‘This is an informed consent form, sign 
it and give me your sample,’ but trying to
generate a dialogue with them.” 

Although they acknowledge that dialogue 
is needed, some Chilean researchers are 
wary. In other countries, including Canada, 
New Zealand, and the United States, Indig- 
enous communities have asked geneticists 
to delay work, change research questions, 
keep data private, and not publish results. 
“One can’t take absolutist [attitudes],” such
as insisting that scientists must keep all data
private, says Lucia Cifuentes, a medical geneticist
at the University of Chile (UCh), Santiago.
“Science needs creative freedom.”

Publishing restrictions would be “censor- 
ship,” says Ricardo Verdugo, a human popu- 
lation geneticist also at UCh Santiago. But he 
thinks a new paradigm is needed. Indigenous 
communities are “the first ones that have the 
right to have a voice,” he says. “What to ask,
why ask it, and how I’m going to interpret 
[and] communicate it, is something that ab- 
solutely requires [their] opinion.” 

For others, now is the moment for drastic
measures. “Other scientists might [question]
me. But, for me, ethics comes first,” says
Macarena Fuentes, a human population
geneticist at the University of Tarapacá,
Arica. “For there to be a transition, extreme 
changes must occur.” 

In Puerto Edén, fed up with what they saw 
as one-sided interactions, the community 
created a protocol for scientific research 
within its territory. Scientists must meet 
with a council to explain their research, 
what they’ll do with the results, and how 
Puerto Edén will benefit. They must also re- 
spect Kawésqar culture, including honoring 
taboos against visiting sacred places. And 
they must give something back, whether a 
simple acknowledgement, a share in any fi- 
nancial rewards, or co-authorship. The plan 
may be exceptional in Chile at the moment, 
but many hope it will become the norm in 
the future. 

The protocol isn’t a rejection of science, 
Tonko Huenucoy explains. The community 
even plans to build a science center and 
field station to attract research to the com- 
munity. But they want to make sure that 
it’s done for and with the Kawésqar, she
says. So “[our] voices are included from the
very beginning.” 


Emiliano Rodriguez Mega is a journalist 
in Mexico City. 
U.S. RESEARCH FUNDING


At NSF, what price geographic diversity? 


U.S. innovation bills present Congress with dueling visions for funding have-not states 


By Jeffrey Mervis 


Top research universities in a handful
of U.S. states conduct the majority of 
research funded by the National Sci- 
ence Foundation (NSF), whereas in- 
stitutions in half the country receive 
only crumbs. 

The Senate and House of Representatives 
both want to redress the imbalance, but law- 
makers in the two chambers have proposed 
different solutions. Those dueling visions 
must be reconciled in negotiations now un- 
derway to finalize a massive bill aimed at 
bolstering U.S. competitiveness with China 
in research and high-tech manufacturing 
that is a top priority for President Joe Biden. 

The U.S. Senate thinks the 
solution is to mandate an eightfold
increase in an existing
initiative—the Established Pro- 
gram to Stimulate Competitive 
Research (EPSCoR)—that serves 
only the have-not jurisdictions 
(25 states, Puerto Rico, Guam, and 
the Virgin Islands). To finance 
that huge growth in EPSCoR, 
NSF would need to shrink core 
research programs in which the 
money is awarded competitively. 

More than 200 universities 
and more than 100 members of 
Congress have launched a last- 
minute fight to remove the Sen- 
ate language from the final bill. 
“Arbitrarily walling off a sizable 
percentage of a science agency’s 
budget from a sizable majority of the 
country’s research institutions would fundamentally
reduce the entire nation’s scientific
capacity,” warned 18 senators and
78 House members in a 24 May letter that 
echoes a 2 April plea to lawmakers from 
a coalition of institutions in non-EPSCoR
states. Curtailing existing NSF programs, 
they say, would harm many less research- 
intensive institutions located outside of 
have-not states. 

The critics prefer the House approach. It 
has proposed new programs to help poorly 
funded institutions in every state. But the 
Senate version may have the upper hand 
in negotiations. It is tucked into the U.S. 
Innovation and Competition Act (USICA), 
which passed in June 2021 with strong bi- 
partisan support. The House version, the 
America COMPETES Act, was approved
by a narrow margin in February with only 
Democratic members in favor. 

NSF says roughly 13% of its research 
funds now go to institutions in EPSCoR 
states. But only about one-fifth of that 
comes directly from EPSCoR. The bulk 
is awarded competitively through NSF’s 
regular research and training programs. 
The Senate bill would allocate 20% of 
NSF’s overall budget, currently $8.8 bil- 
lion, to EPSCoR. If the rule were in place 
now, it would hike EPSCoR’s budget from 
$215 million to roughly $1.75 billion. 
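
The arithmetic behind those figures can be checked directly. The short Python sketch below is not from NSF or either bill; it only applies the Senate's 20% set-aside to the budget figures quoted above, and the constant and function names are illustrative assumptions.

```python
# Figures quoted in the article, in U.S. dollars.
NSF_BUDGET = 8.8e9        # NSF's overall budget
EPSCOR_BUDGET = 215e6     # EPSCoR's current budget
SET_ASIDE_SHARE = 0.20    # share the Senate bill would reserve for EPSCoR


def epscor_under_set_aside(total_budget: float, share: float) -> float:
    """Return the EPSCoR budget implied by reserving `share` of `total_budget`."""
    return total_budget * share


projected = epscor_under_set_aside(NSF_BUDGET, SET_ASIDE_SHARE)
growth_factor = projected / EPSCOR_BUDGET

print(f"Projected EPSCoR budget: ${projected / 1e9:.2f} billion")
print(f"Growth over current budget: about {growth_factor:.1f}x")
```

The result, about $1.76 billion, matches the article's "roughly $1.75 billion," and the ratio of about 8 to 1 is consistent with the eightfold increase described earlier.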

Last month, NSF Director Sethuraman
Panchanathan told a Senate spending
panel he has “an aspirational goal” of sending
20% of NSF’s research budget to have-not
states. But that approach to increasing
geographic diversity didn’t satisfy the panel’s
chair, Senator Jeanne Shaheen (D-NH),
who instructed her Senate colleagues negotiating
the bill “to hold tight to the 20%
requirement” for EPSCoR itself.

Senator Maria Cantwell (D-WA, right) is trying to steer a massive bill
through Congress that contains a controversial provision for geographic
diversity in research funding crafted by Senator Roger Wicker (R-MS).

Shaheen was one of 33 Senators and 
26 House members, all from EPSCoR 
states, who last fall signed a letter to ne- 
gotiators arguing for preserving the Senate 
language. “If the United States is going to 
stay a step ahead of China, we need to pro- 
mote the scientific talent, expertise, and 
capabilities found throughout America, 
not just in a handful of states,” asserts
Senator Roger Wicker (R-MS), who spear- 
headed the letter and crafted the EPSCoR 
provision in the Senate bill. 




Senator Maria Cantwell (D-WA), co- 
chair of the 107-member negotiating 
committee and chair of the Senate com- 
merce committee that oversees NSF, has 
not taken a position on the set-aside but 
says the geographic imbalance “is some- 
thing that we have to address.” And last 
month, at the conference committee’s first 
meeting, she praised Wicker’s “passion for 
spending federal research dollars in areas 
defined by EPSCoR.” 

In contrast, the House’s COMPETES Act 
applauds EPSCoR for “improving research 
capacity and competitiveness” but gives 
the program no specific set-aside. Instead, 
it would create several NSF programs 
aimed at fostering greater geographic
diversity in research.

One would let institutions not
in the top 100 recipients of fed- 
eral research dollars—in 2020 
that meant receiving $255 million 
or more—compete for money to 
conduct research, recruit faculty, 
offer student stipends, and carry 
out “other activities necessary to 
build research capacity.” Another
would fund joint projects be- 
tween research powerhouses and 
“emerging research institutions,”
defined as those receiving less than
$35 million annually in federal re- 
search funding. 

Despite a budget that has 
tripled since 2001, EPSCoR has 
not reshaped the geography of 
NSF’s spending. In 2020, just 
five states—California, Massachusetts, 
New York, Texas, and Maryland—received 
nearly 40% of NSF’s research dollars, 
whereas the bottom five—Vermont, West 
Virginia, Wyoming, and North and South 
Dakota—together got less than 1%. 

The Senate’s views on EPSCoR carry 
weight because the final bill will need the 
backing of most of the 19 Senate Republi- 
cans who voted for USICA. Biden has even 
renamed the proposed legislation, calling 
it the Bipartisan Innovation Act in hopes 
of retaining Republican support. 

Democratic leaders have asked confer- 
ees to reach agreement this month. With 
so much to like in the bills, science ad- 
vocates also want to see the legislation 
enacted—but not at the cost of disrupting 
how NSF funds research. 
Volcanic glass, like that found near Iceland’s Blue Lagoon, can help knit RNA letters into long strands.


BIOCHEMISTRY 


Did black volcanic rock help spark early life? 


Quenched lava may have helped form long RNA strands vital to primordial organisms 


By Robert F. Service 


When life emerged, it did so quickly.
Fossils suggest microbes were
present 3.7 billion years ago, just
a few hundred million years after
the 4.5-billion-year-old planet had
cooled enough to support biochemistry.
Many researchers think the hereditary
material for these first organisms
was RNA. Although not as complex as DNA, 
RNA would still be difficult to forge into the 
long strands needed to convey genetic infor- 
mation, raising the question of how it could 
have spontaneously formed. 

Now, researchers may have an answer. 
In lab experiments, they show how rocks 
called basaltic glasses help individual RNA 
letters, known as nucleoside triphosphates, 
link into strands up to 200 letters long. The 
glasses would have been abundant in the fire 
and brimstone of early Earth; they are cre- 
ated when lava is quenched in air or water 
or when the melted rock created in asteroid 
strikes cools off rapidly. 

The result has divided top origin-of-life 
researchers. “This seems to be a wonderful 
story that finally explains how the nucleoside 
triphosphates react with each other to give 
RNA strands,” says Thomas Carell, a chem- 
ist at the Ludwig Maximilian University of 
Munich. But Jack Szostak, an RNA expert at 
Harvard University, says he won't believe the 
result until the research team better charac- 
terizes the RNA strands. 
Origin-of-life researchers are fond of a primordial
“RNA world” because the molecule
can carry out two distinct processes vital for 
life. Like DNA, it’s made up of four chemical 
letters that can carry genetic information. 
And like proteins, RNA can catalyze chemical 
reactions needed for life. 

But RNA also brings headaches. No one has 
found a set of plausible prebiotic conditions 
that would cause hundreds of RNA letters— 
each of them complex molecules—to link into 
strands long enough to support the complex 
chemistry needed to ignite evolution. 

Stephen Mojzsis, a geologist now at the Re- 
search Centre for Astronomy and Earth Sci- 
ences of the Hungarian Academy of Sciences, 
wondered whether basaltic glasses played a 
role. They are rich in metals such as magne- 
sium and iron that promote many chemical 
reactions. And, he says, “Basaltic glass was 
everywhere on Earth at the time.” 

He sent samples of five different basaltic 
glasses to the Foundation for Applied Mo- 
lecular Evolution. There, Elisa Biondi, a mo- 
lecular biologist, and her colleagues ground 
each sample into a fine powder, sterilized 
it, and mixed it with a solution of nucleo- 
side triphosphates. Without a glass powder 
present, the RNA letters failed to link up. 
But when mixed with the glass powders, 
the molecules joined into long strands, 
some hundreds of letters long, the researchers
reported last week in Astrobiology. No
heat or light was needed. “All we had to do
was wait,” Biondi says. Small RNA strands
formed after just a day, but strands kept
growing for months. “The beauty of this
model is its simplicity,” says Jan Spacek, a
molecular biologist at Firebird Biomolecu- 
lar Sciences. “Mix the ingredients, wait for a 
few days, and detect the RNA.” 

Still, the results leave questions unan- 
swered. One is how the nucleoside triphos- 
phates could have arisen in the first place. 
Biondi’s colleague Steven Benner says recent 
research shows how the same basaltic glasses 
could have promoted the formation and sta- 
bilization of the individual RNA letters. 

A bigger issue, Szostak says, is the shape of 
the RNA strands. In modern cells, enzymes 
ensure most RNAs grow into linear chains. 
But RNA can also bind in complex branch- 
ing patterns. Szostak wants the researchers 
to report which type of RNA the basaltic 
glasses created. “I find it very frustrating that 
the authors have made an interesting initial 
finding but then decided to go with the hype 
rather than the science,” he says. 

Biondi admits her team’s experiment al- 
most certainly produces a small amount of 
RNA branching. However, she notes that 
some branched RNAs exist in organisms to- 
day, and related structures may have been 
present at life’s dawn. She also says other 
tests the group performed confirm the pres- 
ence of long strands with connections that 
most likely mean they are linear. “It’s a 
healthy debate,” says Dieter Braun, an origin- 
of-life chemist at Ludwig Maximilian. “It will 
trigger the next round of experiments.”
A WILD HOPE

Two decades after it disappeared in nature,
the stunning blue Spix’s macaw will be reintroduced to its forest home

By Kai Kupferschmidt,
in Curaca, Brazil

In 1995, conservationists and scientists
embarked on a desperate attempt to
save the world’s rarest bird, a blue-gray
parrot called the Spix’s macaw.
The bird had scarcely been spotted
since scientists first described it in the
early 19th century, and it had taken on
an aura of mystery, making it irresistible
to parrot lovers, and to poachers.
“For well over a century we just had this
very, very weak information that there was
this kind of mythical, rather beautiful blue
bird,” says Nigel Collar, a conservationist
at BirdLife International. By the mid-1990s
only a single individual remained alive in
the wild, close to this dusty, small town in
northeastern Brazil.

From DNA in molted feathers, researchers 
in the United Kingdom confirmed that the 
last wild bird was a male. At the time, fewer 
than three dozen birds were known to be held 
in collections and zoos around the world, 
and a decision was made to release a single 
female in hopes the birds would pair and 
produce offspring. The female was released 
close to where the male lived and seemed 
to quickly adapt to her new life, eating wild
food and avoiding an attack by a falcon. She
grew stronger by the day, flying farther and 
farther, and after little more than 2 months 
had paired with the male. Two weeks later, 
she mysteriously disappeared. 

Years later, a local man said he had found 
the bird dead below a power line. “If that’s 
really true, then that is just incredibly bad 
luck,” Collar says. It is almost unheard of for 
parrots to hit power cables, he says, and in 
reality she might have been taken by poach- 
ers. “The world of Spix’s macaw is full of very, 
very great uncertainties and a lot of people 
who say a lot of things that they don’t neces- 
sarily really mean.” The wild male vanished 
a few years later, and the Spix’s
fate seemed sealed—another 
species lost. 

Now, conservationists are at- 
tempting to undo that fate. On 
11 June, more than a quarter- 
century after the female flew 
into oblivion, they plan to re- 
lease eight Spix’s macaws from 
captivity into the wild. Twelve 
more are supposed to follow 
at the end of the year and still 
more in the years to come. If 
everything goes according to 
plan, these birds will be the 
vanguard of a new population 
of Spix’s macaws in their natu- 
ral habitat. The project, long 
hampered by infighting and 
overshadowed by controversy, 
had to overcome significant 
scientific hurdles to even come 
this far. But the biggest chal- 
lenge still lies ahead. 

“The Spix’s project is unique 
in that they are reintroducing 
a species back into the wild 
that is currently extinct, has 
been extinct in the wild for over 
2 decades,” says Thomas White,
a wildlife biologist at the U.S. 
Fish and Wildlife Service and a 
technical adviser to the project. 
“There’s very few reintroduction 
programs around the world that 
have done something like that,
none with parrots or macaws.”

In the wild, Spix’s macaws nest in the hollows of large caraibeira trees, which grow
along streams in the dry forest of northeastern Brazil.


Few reintroductions of birds 
have been successful, and none 
was as ambitious as this one, 
says George Amato, a conservation bio- 
logist at the American Museum of Natural 
History. Yet for the Spix’s it has to be tried, 
he says. “I hope it works, because we really 
have no other alternatives.” 


THE NATURAL HOME of the Spix’s macaw 
is in the caatinga, a tropical dry forest in 
northeastern Brazil that covers 10% of the 
country. In the rainy season, which lasts for 
about 2 months, everything appears lush 
and green. But the rest of the year plants 
here compete in shades of gray and white— 
caatinga means “white forest” in the Indige- 
nous Tupi language. It is here that the Spix’s 
macaws once nested in the hollows of old 
caraibeira trees growing along the creeks 
that cut through the caatinga, feeding on 
seeds and nuts. 

It is impossible to know how many Spix’s 
macaws existed in the past. By the time 
Western science discovered the bird, hu- 
mans had already started to parcel large 
parts of the caatinga into ranches. In 1819, 
German naturalist Johann Baptist von Spix
spotted the parrot on an expedition to the
interior of Brazil. Spix noted that the bird
appeared to be “very rare,” then shot it and
brought it home to Munich, setting the tone 
for humanity’s relationship with this strik- 
ing bird going forward. 

As the human footprint increased in the 
caatinga, the bird became even rarer. Tragi- 
cally, this only made it more coveted by 
parrot collectors, who were willing to pay 
tens of thousands of dollars for a single 
bird. “The rarer it was, the more it became 
a kind of status symbol,” Collar says. The 
bird became something akin to the exceed- 
ingly rare blue Mauritius stamp coveted by 
philatelists, says Roland Wirth, a conserva- 
tionist at the Zoological Society for the Con- 
servation of Species and Populations. “The 
very wealthy, very passionate collectors 
really wanted to have one, and they would 
do almost anything to do so.” 

By the beginning of 1987, only three Spix’s 
macaws were known to survive in the wild,
and by the end of that year,
poachers had taken two of them. 
After the plan to pair the last 
male with a captive bird failed 
in 1995, the male remained with 
a female of a different species, 
an Illiger’s macaw, until he, too, 
disappeared in October 2000. 
The International Union for 
Conservation of Nature officially 
declared the Spix’s macaw ex- 
tinct in the wild in 2019, exactly 
200 years after Spix had de- 
scribed it. 

Even then, the bird retained 
its hold on the popular imagi- 
nation. The story of the last 
lone male inspired songs— 
including one written from the 
perspective of the Illiger’s fe- 
male waiting in vain for his re- 
turn—and two animated movies 
that together earned $1 billion. 


ON A HOT MORNING in Febru- 
ary, Martin Guth, a bald and 
burly German businessman and 
parrot collector, stood in the 
spot where the Spix’s will be- 
gin its new life in the wild. The 
nongovernmental organization 
(NGO) he founded, the Asso- 
ciation for the Conservation of 
Threatened Parrots (ACTP), has 
taken on the challenge of bring- 
ing the bird back to the caatinga. 
ACTP, which houses more than 
170 Spix’s macaws in Tasdorf,
near Berlin, built a facility just a 
few hundred meters from where 
Guth is standing and, in March 2020, flew 
52 macaws to Brazil by private jet to take up 
residence there. In 2021, three chicks hatched 
at the facility, the first Spix’s born in the bird’s 
original home in more than 30 years. 

But that morning, Guth was angry. 
Nearby, workers were busy constructing a 
huge U-shaped aviary where the birds will 
be able to fly longer distances than they can 
in their small cages inside the main facility. 
It was running behind schedule. “Even on 
the way here, the guy still said everything 
was finished,” Guth grumbled. He was con- 
vinced that a rival who was previously in- 
volved in the Spix’s project had something 
to do with the delay. The Spix’s project may 
have high-minded goals, but its history is 
replete with jealousies and backbiting. 

The idea of breeding Spix’s macaws in 
captivity and reintroducing them to the 
wild began long before Guth’s involvement, 
and even before the lone wild male had dis- 
appeared. In 1990, conservationists formed 
a committee to oversee a reintroduction 
program. That meant building up an adequate
captive population, which proved to
be a complicated and controversial process. 

At first conservationists only knew of a few 
captive birds—and owners were reluctant to 
come forward, because the export of wild- 
life had been illegal in Brazil since 1967. But 
the Brazilian government agreed to grant 
amnesty to owners if their birds joined the 
breeding program, and “one by one, people 
came out and admitted they had Spix’s ma- 
caws,” says Wolfgang Kiessling, a business- 
man who founded and runs Loro Parque, 
a private zoo on the island of Tenerife that 
held some Spix’s macaws for many years. 

Still, by 1996 only 39 captive birds were 
known around the world. Making matters 
worse, most of them were closely related. 
Only nine of the birds had come from the 
wild, and 21 of the remaining 30 
were offspring from a single pair 
in the Philippines, raising con- 
cerns about inbreeding. For the 
Spix’s to have any future, birds 
from different collectors needed to 
be brought together to breed, but 
arguments over who would send a 
bird to whom under what condi- 
tions kept derailing the plans. “The 
rarer the animal, the more politics 
is involved,” says Cristina Miyaki, 
a bird geneticist and a member of 
the advisory committee of the Spix’s 
project. In contrast to the spirit of 
cooperation required for a success- 
ful recovery effort, Collar wrote in 
1992, “ownership is a matter of jeal- 
ousy, prestige and possessiveness 
that is fundamentally different in 
psychological origin.” 

Meanwhile, the constellation of owners 
kept changing. Starting in 2000, Sheikh 
Saoud Bin Mohammed Bin Ali Al-Thani of 
Qatar bought dozens of Spix’s to keep at 
Al Wabra, his private wildlife preserve. In 
time, he came to own the vast majority of all 
known Spix’s macaws in the world. 

Guth entered the scene in 2005, beating 
out the sheikh to buy from a private Swiss 
owner three Spix’s macaws that had not pre- 
viously been part of the breeding program. 
“The three birds he had were the most 
important ones, because they could im- 
prove the genetics of the population,” says 
Camile Lugarini, a veterinarian at the 
Chico Mendes Institute for Biodiversity 
Conservation (ICMBio), who leads the 
Spix’s macaw project for the Brazilian Min- 
istry of the Environment. 

In May 2012, a meeting in Brazil’s capital, 
Brasilia, brought together representatives of 
all the important stakeholders. It was testy. 
One participant argued that Guth should 
have no part in the project because he had 
served a prison sentence and, this person
claimed, had sold endangered birds ille- 
gally, in violation of the Convention on In- 
ternational Trade in Endangered Species of 
Wild Fauna and Flora. (Guth says that like 
other breeders and NGOs, ACTP sells some 
birds legally, but has never sold Spix’s or 
other highly endangered birds, and that his 
offenses were committed decades ago and 
have nothing to do with the current project.) 
Tim Bouts, a veterinarian who was then the 
curator at Al Wabra and attended the meet- 
ing, says he spoke in defense of Guth, who 
was not present: “Let’s be honest, this table 
here is full of criminals. Every single Spix’s 
that came into captivity was illegal.” 

The meeting ended with no agreement. 

Lost ground
The Spix's macaw's native home is the caatinga, a dry tropical forest
that is leafless most of the year. The forest has dwindled because of
ranching, another obstacle facing the effort to reintroduce the macaw.

Guth has pressed ahead, even as some
have questioned his motives and methods,
pointing to the lack of transparency around
ACTP and its sources of funding. Guth says
some donors prefer to remain anonymous 
and that he is trying to avoid the disputes 
over funding and credit that doomed the 
project in the past. “Yes, we are doing things 
differently,” he says. “It certainly didn’t work
the way they tried it before.” 

Even some people who say they are intimi- 
dated by Guth acknowledge the effectiveness 
of his pushing, bullying, and cajoling. “He is 
a bit of a bulldozer,” Wirth says. “But he gets
things done.” When the sheikh died suddenly 
in 2014 and the future of his Spix’s macaws 
was in doubt, Guth stepped in through his 
NGO to bring the birds from Qatar to Tas- 
dorf. In June 2018, Guth and Brazil’s envi- 
ronment minister signed a memorandum 
of understanding in Berlin to build the facil- 
ity in Brazil, transfer birds, and reintroduce 
them. (ICMBio and the Pairi Daiza Founda- 
tion were also signatories.) 

“I wasn’t born as a conservationist,” Guth 
says. But as he became involved in the reintroduction
effort, he grew determined
to prove his critics wrong. “They said, ‘He
won’t be able to breed the birds.’ I did. They
said, ‘He won’t send any birds to Brazil.’ I
did. They said, ‘He won’t reintroduce the
birds.’ We are doing that.”

He has put himself in an interesting 
position, Collar says. “He is the one now 
who can go down in history as the person 
who saved the Spix’s macaw. Or if he really 
messes up, then he goes down in history as 
the person who made it go extinct.” 


WHILE OWNERS WERE FIGHTING over control 
and credit, conservationists and researchers
were fighting to save the species. When
ornithologist Cromwell Purchase went to
Al Wabra in 2010 to head its Spix’s macaw
program, he was told the species was “on
the fence.” At the time, 54 of 71 birds
known worldwide were in Qatar, and
the captive population faced twin
threats: disease and a low birth rate.

The major disease threatening captive
Spix’s was proventricular dilatation
disease, which affects the nerves
in parrots’ gastrointestinal tract and
causes them to slowly waste away. A
common scourge of pet birds, it had
been known since the 1970s, but its
cause was completely unclear. Then,
in 2008, researchers identified a
novel virus in the brains of affected
birds: a type of bornavirus, a group
known to cause brain disease in
horses and sheep.

“We tested all known Spix’s in the
world for this virus,” says Michael
Lierz, a veterinarian at the Justus
Liebig University Giessen. In Qatar,
a traffic light system was implemented,
with infected birds deemed “red” and
separated from the others. This eventually
eliminated the threat of avian bornavirus
to the Spix’s population.

The other problem was reproduction.
Only a few pairs were producing chicks.
At first a decision was made to keep them
reproducing. “The goal was to produce as
many animals as possible to keep the species
from going completely extinct,” Lierz says.
Over time the focus shifted to making better
matches, in order to preserve the Spix’s
genetic diversity and, therefore, its chances
of survival. But birds with diverse genetics
wouldn’t necessarily form a pair. “Parrots are
monogamous and very choosy,” Lierz says.

So, veterinarians at Al Wabra considered 
artificial insemination. For many birds, 
including chickens, pigeons, and birds of 
prey, this is fairly straightforward, Lierz 
says. The technique involves massaging 
a male’s cloaca from the outside with the 
thumb. (“A short and smooth thumbnail is 
advantageous for performing cloacal massage
and protects the bird from accidental
injury,” one paper notes.) But this technique
does not work on most large parrots.
Around 2010, Lierz and his colleague Daniel 
Neumann developed a new method: insert- 
ing a small probe into the cloaca to deliver a 
weak electric current that stimulates a male 
bird to release sperm. “As kids we used to 
hold these 9-volt batteries to our tongues 
and it tingled, that’s roughly how you have 
to imagine this,” Lierz says. 

With artificial insemination, the re- 
searchers could finally pair birds according 
to their genetics. But the timing was cru- 
cial: Females usually lay two or three eggs 
and the moment one egg is laid is the right 
time to inseminate the next one. Purchase 
says he and Neumann spent hours watching 
female Spix’s macaws on video monitors. 
“As soon as we see the egg, we're out and we 
go from aviary to aviary and we catch the 
male that we want, male No. 1 on the list. 
We try and collect semen from him, and if 
we don’t get enough ... then we go to male 
No. 2,” Purchase says. In May 2013, the first 
artificially inseminated Spix’s macaw chicks 
hatched. More followed. “That’s what got us 
out of the genetic bottleneck,” Bouts says. 


THE MORNING AFTER GUTH was fuming 
about the aviary delay, Purchase walked 
into a large room at the facility carrying a 
gray plastic cage in each hand. He set them 
down on the tiled floor, opened the door of 
one, reached inside with a dark towel, and 
enfolded what was inside. Kneeling on the 
floor, he delicately unwrapped the towel. 
A gray head emerged first, then turquoise 
feathers covering the parrot’s belly, and fi- 
nally the rich blue of its back and tail. 

Purchase carried the bird over to Francois 
Le Grange, a veterinarian, who began to ex- 
amine it—a final check before it would join 
the other candidates for release in the not- 
quite-finished outdoor aviary. The bird’s 
outraged “ca-a ca-a” echoed off the walls as
Le Grange plucked a feather from beneath 
its wing. Then he listened to its heartbeat 
with a children’s stethoscope. He swabbed 
the mouth and the cloaca and finally drew 
some blood from a vein in the neck. 

The swabs would be tested for patho- 
gens that might pose a risk to other ani- 
mals after the birds are released. But the 
team is much more worried about the 
dangers these parrots themselves will face 
in the wild. After generations in captivity, 
their instincts for navigating and finding 
food have weakened, White says. There 
are predators, too, including opossums, 
snakes, and birds of prey. And, of course, 
humans—the species that drove the bird to 
extinction in the first place. 
Together these challenges doomed some
earlier reintroduction programs. One of the
highest profile examples was an attempt to
bring the thick-billed parrot back to Arizona,
Amato says. This brightly colored bird still
lives in Mexico but has been hunted to extinction
in the United States. Between 1986
and 1993, 88 of them (mostly confiscated
birds originally trapped illegally in Mexico,
but also some captive-bred birds) were released
in the Chiricahua Mountains in Arizona.
Many were killed by hawks or cats or
starved to death. After 2 months, only about
two-thirds of the wild-caught birds survived.
But the captive-bred birds did much worse,
as a paper noted in 1994: “Almost all individuals
have been lost within a few days of
release as a result of substantial deficiencies
in basic survival skills.” The program was
abandoned in 1993, and the last time a thick-billed
parrot was spotted in Arizona was in
1995. “The release program was a failure,
even though a lot of money and effort was
spent on it,” Amato says. “After that, many
biologists felt that release programs for parrots
generally were unlikely to be successful.”

Johann Baptist von Spix first described and painted
the macaw in an 1824 publication.

Yet Amato notes some hopeful counter- 
examples: the feral populations of escaped 
parrots that thrive in many parts of the 
world, including London and New York 
City. “These are like accidental reintroductions
that worked,” he says. Some recent
planned reintroductions have also had
positive results, White says, including one
he was involved in: the reintroduction of
Puerto Rican parrots to El Yunque National
Forest after they were wiped out by Hurri- 
cane Maria in 2017. Since 2020, 75 captive- 
reared animals have been released in the 
forest, which now hosts 34 birds. Four new 
nests were spotted this year, White says. 
“This was a true reintroduction and it has 
been very successful.” 


THAT NIGHT, as darkness descended over the 
caatinga, Lugarini headed out with a col- 
league to a creek near the facility. Wearing 
leather gaiters to protect against snakes, 
she followed the mostly dry creek bed, mov- 
ing as quietly as possible. She stopped in 
front of a caraibeira tree, where a pair of
Illiger’s macaws had made their nest. 

Illiger’s macaws, also known as blue-winged
macaws, play an important role in
the plan to bring back the Spix’s. Illiger’s
are more common and inhabit a larger area
than the Spix’s macaws, but in the caatinga
the two birds’ lifestyles overlap. Both nest in
hollows in caraibeira trees and feed on the
same fruits and nuts. When the eight Spix’s
are released, eight Illiger’s macaws taken
from the wild will be released with them.
The team hopes this mixed flock will join
up with wild Illiger’s in the caatinga, allowing
the Spix’s macaws to benefit from their
knowledge of how to avoid predators, find
food, and navigate.

The team had already collected seven
Illiger’s, and Lugarini had come for the
eighth. Her headlamp casting a red glow,
she grasped a cord looped around a branch
high in the caraibeira tree, fixed a rope to
it, and then used the cord to pull the rope
over the branch and back down. Looking
up, she sighed with apprehension at the
sight of bats circling the tree. “That’s worse
than the snakes,” she said. Yet she slowly
ascended the rope, the red light marking
her progress. Ten meters up she reached the
nesting hollow and looked inside. No birds.
The Illiger’s macaws that had been nesting
here were gone.

One clear lesson from previous reintroductions
is that releasing more animals is
better. That’s because a bigger group can
work together to spot dangers and find
food. Finding a suitable mate is easier, too.
For highly social species like macaws, numbers
are especially important. “Let’s say you
release 20 individuals and they all go 20
different directions, well then you haven’t
reestablished a population,” White says.
“They need to live in a group.” Combining 
captive Spix’s and wild Illiger’s thus solves 
two problems, White says. “We can actually 
increase the flock size without extra Spix’s 
... while using a native species which knows 
the habitat, knows the area, that can func- 
tion as mentors.” 
Veterinarian Francois Le Grange (top right) and animal keeper Sebastian Laurisch examine a Spix’s macaw at the
breeding station in Germany where these three chicks (bottom) were born. 


Releasing birds of the right age can help 
keep them from scattering. Spix’s macaws 
start to reproduce around age 4 and then 
tend to return to the same nesting site year 
after year. “The sooner that those released 
macaws start reproducing, the sooner they 
become anchored to that site,” White says.
“So you want to have birds that are enter- 
ing or at reproductive age.” Providing sup- 
plementary food and nest boxes may also 
encourage the birds to remain close to the 
release site. 

When White and other researchers re- 
viewed 47 releases of captive parrots into 
the wild, they found that the single biggest 
threat to success was predation. To reduce
this risk, Purchase put metal bands around 
trees with nest hollows or nest boxes to 
keep predators like opossums from climb- 
ing the trees. To avoid tipping off would-be 
poachers, he put decoy bands around trees 
without nests, as well. The birds will also 
wear tracking collars. 

After a long day, Lugarini headed back to 
her hotel in Curaca. As the four-wheel drive 
vehicle bounced over the dusty road, goats 
scattered and closed wooden gates slowed 
her progress. It was a reminder that the 
Spix’s natural habitat barely exists anymore. 
A restoration project is ongoing but has
been hampered by the lack of knowledge
of this little-studied biome and by its loca- 
tion in one of the poorest regions of Brazil, 
where the goats that provide a lifeline for 
the local population have devoured much of 
the natural vegetation. 

“In the beginning I did sometimes 
think, ‘Why are we putting all this effort 
into bringing back one species that is ex- 
tinct when there are so many other species 
that we could still save from extinction?’”
Lugarini says. “But you have to remember 
that this flagship species helps us preserve 
and restore the caatinga, and that helps 
many other species, too.” 

Curaca is home to about 30,000
inhabitants, and to many homages to the
Spix’s. Next to the gas station is the Spix
hotel. The theater, restored with money
from the Spix project, is bright blue. The
city’s flag in front of the town hall includes
a Spix’s macaw, though Lugarini notes “they
got it wrong”: The bird has the yellow markings
around the eyes and next to the beak
that are typical of the Lear’s macaw, another
threatened macaw that lives not far away.

One resident, Fernando Ferreira, wrote
the song about the lovesick Illiger’s macaw.
Wearing shorts and a T-shirt, his gray hair
swept back in a ponytail, Ferreira sat down
with a guitar and sang another song he
wrote about the Spix’s macaw, known here
as ararinha azul, or little blue macaw: “My
wish is to see you fly, my wish is to see you
come back,” he sang. On the afternoon of
11 June, Ferreira will perform this song at
a ceremony at the theater. There will be a
video, speeches, and a press conference.
Earlier that day, in front of a small group of
people, Purchase will open the door of the
aviary to release the birds.

For those who have worked toward this
for years, it will be a moment of joy and
apprehension. “It will feel like a weight
off my shoulders, probably,” Purchase says.
But then comes the next weight: worrying
about their survival. There is an element of
guilt, Miyaki says, because humans drove
the Spix’s to extinction. “We owe it to the
species, for it to go back to the wild.” But the
experience of 1995 still casts a shadow, she
says. “The frustration after the first release 
of that female was so big,” she says. “I try to 
be optimistic, but I’m very anxious.” 

The project estimates that between one- 
third and two-thirds of the birds will be 
lost in the first year. If the losses are higher, 
the birds may be taken back in. “You try to 
make sure that you have covered all of the 
bases and thought about as many possible 
options and outcomes as possible,” White
says. “But the day you release those birds, 
the day they leave that cage, a lot of things 
are no longer within your control.” 
Quantum learning unravels quantum system


A quantum computer has a decisive advantage in analyzing quantum experiment results 


By Vedran Dunjko 


In the early 1980s, American physicist
Richard Feynman proposed that ma- 
chines can exploit quantum phenomena 
to perform otherwise intractable com- 
putations. The kind of computation he 
envisioned was broadly about simulating 
the properties of a quantum system given 
its classical description. In the decades that 
followed, researchers have identified numer- 
ous other problems that in theory can only 
be solved within a reasonable time frame by 
using such a quantum computer. However, 
a quantum computer that can exercise this 
advantage over a classical computer does 
not exist yet. A recent hope is that near-term 
quantum computers may be used as a new
type of machine learning device that offers 
an edge in analyzing data from quantum ex- 
periments. On page 1182 of this issue, Huang 
et al. (1) present an experimental realization 
of a quantum learning algorithm that has 
a provable advantage over its conventional 
counterpart while being within the reach of 
today’s quantum computers. 

The signature capacity of a quantum com- 
puter is the ability to predict the behavior of 
a many-particle quantum system when given
its initial condition. With the development 
of machine learning methods, even more 
complex questions can be asked. Machine 
learning can enable such prediction even 
without full knowledge of the system, but 
with merely having access to previous ex- 
perimental data. The task can be thought 
of as a two-stage process. The computer is 
fed a dataset that stems from some previous
experiments over a quantum system.
Then, the classical or quantum computer 
will have to predict the future of the system 
under slightly different settings. Intuitively, 
such data analysis may inherit the so-called 
quantum-classical performance gap, as de- 
scribed by Feynman: that the computation 
of the future state of the system, given its 
full description, is intractable for classical 
but feasible for quantum computers. 
However, unexpectedly, the inclusion of 
training data could close this gap. Classical 
machine learning can sometimes predict 
properties of complex quantum systems (2), 
so it is unclear whether quantum computers
hold an edge in this setup. Huang et al.
propose an approach that will give quantum
computers a decisive edge: by leveraging
the quantum computer’s ability to process
“quantum data,” raw quantum states that
result from a quantum experiment and not
mere classical information. The quantum
data can be used by the quantum computer
to predict future results while requiring far
fewer experiments.

By harnessing the power of the Google Sycamore
processor (pictured here), Huang et al. showcase the
exponential advantages offered by quantum computers
for analyzing data from quantum experiments.

One cannot input quantum data into 
a classical computer. Intuitively, one may 
think that this would give the quantum 
device a straightforward advantage. But 
the actual scenario is more nuanced. For 
a classical setup, the quantum state of an 
experiment can be measured and used as 
input, with each measurement freely cho- 
sen by the classical learning algorithm. 
Because the classical computer can arbitrarily
choose when to measure each of the
experiments, at least in principle all
the information encoded in quantum states
can be accessible. For a quantum setup, the
quantum computer provides a minimal but 
key additional capacity: a small quantum 
memory that enables joint measurements 
on two copies of quantum data. 

So in both cases, all the quantum data 
are converted to classical information be- 
fore calculation, but in slightly different 
manners. Joint measurements—used in the 
quantum-enhanced scenario—unravel cor- 
related properties of two separate quantum 
systems. This fundamentally exploits quan- 
tum entanglement (when the quantum states 
of two or more objects are intertwined with 
each other) and cannot be substituted by 
pairs of individual measurements. Since the 
early days of quantum information theory, 
it has been known that joint measurements 
can help distinguish quantum states, even 
when the states are uncorrelated (3). But 
until recently, it was not clear just how large 
an advantage this exploit can give quantum 
computers over their classical counterparts. 
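
The idea of a joint measurement on two copies can be made concrete with a one-qubit toy calculation. The sketch below is only an illustration and is not the protocol of Huang et al.: it assumes the joint measurement is a Bell-basis measurement (the article does not say which measurement was used), and every function and variable name is hypothetical. It relies on the standard fact that the four Bell states are simultaneous eigenstates of X⊗X, Y⊗Y, and Z⊗Z, so a single measurement setting on two copies of a state ρ yields the squared expectation values tr(Pρ)² for all three Pauli operators at once, whereas a single-copy measurement must commit to one basis at a time.

```python
import numpy as np

# Single-qubit Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = {"X": X, "Y": Y, "Z": Z}

# Rows are the four Bell states in the basis |00>, |01>, |10>, |11>.
s = 1 / np.sqrt(2)
BELL = np.array([
    [s, 0, 0, s],    # Phi+
    [s, 0, 0, -s],   # Phi-
    [0, s, s, 0],    # Psi+
    [0, s, -s, 0],   # Psi-
], dtype=complex)


def random_qubit_state(rng):
    """Density matrix of a random pure single-qubit state (illustrative input)."""
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())


def bell_outcome_probs(rho):
    """Outcome probabilities of a Bell-basis measurement on two copies of rho."""
    two_copies = np.kron(rho, rho)
    # <b| rho (x) rho |b> for each Bell state b.
    return np.real(np.einsum("bi,ij,bj->b", BELL.conj(), two_copies, BELL))


def squared_expectations_from_joint(rho):
    """Recover tr(P rho)^2 for P in {X, Y, Z} from one joint measurement setting."""
    probs = bell_outcome_probs(rho)
    result = {}
    for name, P in PAULIS.items():
        # Each Bell state is an eigenstate of P (x) P with eigenvalue +1 or -1.
        eigenvalues = np.real(
            np.einsum("bi,ij,bj->b", BELL.conj(), np.kron(P, P), BELL)
        )
        result[name] = float(probs @ eigenvalues)
    return result


if __name__ == "__main__":
    rho = random_qubit_state(np.random.default_rng(0))
    joint = squared_expectations_from_joint(rho)
    for name, P in PAULIS.items():
        direct = np.real(np.trace(P @ rho)) ** 2
        print(name, round(joint[name], 6), round(direct, 6))
```

In this toy the probabilities are computed exactly; in an experiment they would be estimated from repeated runs, and the number of runs needed is where the advantage described above would appear.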

Building from the research line on so- 
called shadow tomography (4-6), Huang et 
al. argued that joint measurements lead to 
substantial advantages for learning about 
quantum systems. Namely, the quantum- 
enhanced strategy is exponentially more 
economical in terms of the number of quan- 
tum experiments needed for predicting the 
outcomes of just two measurements (6). 
The authors demonstrated the advantages 
of a quantum learning experiment using 
the Google Sycamore processor. The natural 
scenario of quantum data learning involves 
a “transducer” that transports the quantum 
state of results from an experiment into the 
quantum computer. Their experiment was 
simulated in the same quantum processor 
that analyzes the data, in a lab-on-a-chip set-
ting. Once the quantum state is prepared,
it is analyzed with classical and quantum- 
enhanced methods. 

For the optimal predictions, the exact 
joint measurements may be known, at least 
in idealized settings. However, in the real 
experiment, the state preparation is im- 
perfect, as is the measurement performed. 
To counteract this, the quantum process- 
ing is supplemented with classical machine 
learning to extract the strongest signals in 
the presence of experimental errors. This 
classical-quantum hybrid approach demon- 
strates advantages in our capacity to learn 
various fundamental properties of quantum 
systems—for example, predicting whether 
an unknown quantum process satisfies time- 
reversal symmetry. Their tests show that 
quantum computers can maintain their ad- 
vantages in solving certain problems, even 
when errors specific to quantum computers 
are taken into account. 

The work of Huang et al. intertwines the 
ability to characterize quantum systems (4- 
6) with machine learning, with implications 
for near-term quantum computers and per- 
haps even quantum sensing. The introduced 
generalization of classical machine learning 
to allow quantum data as inputs allows for 
certain benefits; namely, difficult proofs of 
advantages of quantum computers become 
easier. However, because of the hardware 
required to transfer quantum data in its un- 
perturbed state from an experiment into the 
quantum computer, this method may be dif- 
ficult to implement in certain settings, such 
as the high-energy physics experiments at 
the Large Hadron Collider. In smaller-scale 
experiments, however, transduction may 
be reasonable—for example, in quantum- 
optical experiments with nitrogen-vacancy 
centers in diamonds (7), which are often de- 
signed with transporting quantum informa- 
tion in mind. In a related vein, this work also 
opens a frontier for quantum sensing that 
involves quantum states and may lead to 
better advantages (8). Huang et al. proved in 
detail that for the data-driven prediction of 
properties of quantum experiments, no clas- 
sical computer will ever pose a challenge to 
quantum ones—and that quantum comput- 
ers may soon help expand human knowledge 
into new echelons. 


REFERENCES AND NOTES 


1. H.-Y. Huang et al., Science 376, 1182 (2022).
2. H.-Y. Huang et al., Nat. Commun. 12, 2631 (2021).
3. C. H. Bennett et al., Phys. Rev. A 59, 1070 (1999).
4. S. Aaronson, Proc. R. Soc. London A 463, 3089 (2007).
5. S. Aaronson, STOC 2018: Proceedings of the 50th Annual
ACM SIGACT Symposium on Theory of Computing
10.1145/3188745.3188802 (2018).
6. S. Chen, J. Cotler, H.-Y. Huang, J. Li, 2021 IEEE 62nd
Annual Symposium on Foundations of Computer Science
(FOCS) (IEEE, 2022), pp. 574-585.
7. M. Ruf et al., J. Appl. Phys. 130, 070901 (2021).
8. C. L. Degen, F. Reinhard, P. Cappellaro, Rev. Mod. Phys.
89, 035002 (2017).


10.1126/science.abp9885 


QUANTUM COMPUTATION 


Solving a puzzle with atomic qubits

A quantum computer makes light work of the
maximum independent set problem


By Monika Schleier-Smith 


Imagine that you are asked to color a
map of the world. Starting with your 
favorite color, you endeavor to fill in 
as many countries as possible with- 
out giving any neighboring countries 
the same color. This puzzle, despite its 
straightforward premise, is notorious for its 
computational complexity. On page 1209 of 
this issue, Ebadi et al. (1) report a quantum 
algorithm for solving the puzzle—known 
as the maximum independent set (MIS) 
problem—using individual atoms trapped 
in optical tweezers to represent the coun- 
tries on the map. The demonstration is an 
important milestone in the broad effort to 
understand which computational problems 
stand to benefit from quantum computers. 
To date, only a few quantum algorithms 
have been proven to offer clear advantages 
over classical computers. Moreover, even in 
cases where quantum computers theoreti- 
cally provide a benefit—such as for factor- 
ing large numbers—practical applications 
will require major advances in quantum 
hardware beyond the current state of the 
art. By contrast, the coloring puzzle pre- 
sented by Ebadi et al. belongs to a large 
class of optimization problems (2) that are 
potentially easier to solve using near-term 
quantum devices (3) but for which the at- 
tainable quantum speedup remains largely 
an open question (4-6). Such optimization
problems, with technological relevance in 
areas such as supply chain logistics, can ge- 
nerically be framed as minimizing what is 
known as a cost function. The solution can 
be calculated by tasking the quantum com- 
puter to minimize the energy of a system 
of interacting particles or qubits, where the 
specific problem is encoded in the structure 
of the interactions. 

By harnessing the power of the Google Sycamore
processor (pictured here), Huang et al. showcase the
exponential advantages offered by quantum computers
for analyzing data from quantum experiments.

To generate the structure of interactions
required to represent MIS problems, Ebadi
et al. used qubits encoded in the internal
states of optically trapped atoms. Each 
atom can either be in the electronic ground 
state or a highly excited state known as a 
Rydberg state, where the electron cloud 
is thousands of times larger than in the 
ground state. An atom can be excited to 
the Rydberg state by a laser. However, 
any attempt to excite multiple neighbor- 
ing atoms is constrained by strong interac- 
tions between atoms in the Rydberg state. 
Specifically, no more than a single atom 
can be excited within a minimum distance 
known as the blockade radius (7-9). Thus, 
atoms that are closer to one another than 
the blockade radius are equivalent to coun- 
tries that share a border, where only one 
but not both can be colored blue, or in this 
case, be excited to the Rydberg state. 
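
The mapping from atom positions to a map-coloring puzzle can be sketched classically. The snippet below is a minimal illustration, assuming a hypothetical layout of four atoms on a line and an assumed blockade radius of 1.5 in the same arbitrary units. It links every pair of atoms closer than the blockade radius and then finds the largest sets of atoms that could be excited together by brute force. The function names are illustrative and are not taken from Ebadi et al.

```python
from itertools import combinations
from math import dist


def blockade_graph(positions, blockade_radius):
    """Return edges (i, j), i < j, for atoms closer than the blockade radius."""
    n = len(positions)
    return {(i, j) for i, j in combinations(range(n), 2)
            if dist(positions[i], positions[j]) < blockade_radius}


def is_independent(subset, edges):
    """True if no two atoms in the subset are linked by a blockade edge."""
    return not any(pair in edges for pair in combinations(sorted(subset), 2))


def maximum_independent_sets(n, edges):
    """Brute force: every independent set of the largest possible size."""
    for size in range(n, 0, -1):
        found = [s for s in combinations(range(n), size)
                 if is_independent(s, edges)]
        if found:
            return found
    return [()]


# Hypothetical layout: four atoms on a line with unit spacing
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
edges = blockade_graph(positions, blockade_radius=1.5)

print(sorted(edges))                                    # [(0, 1), (1, 2), (2, 3)]
print(maximum_independent_sets(len(positions), edges))  # [(0, 2), (0, 3), (1, 3)]
```

The brute-force search examines every subset, so it is only practical for a handful of atoms; that exponential cost is what makes the problem hard at scale.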

The method was put to test using a quan- 
tum processor with up to 289 atomic qubits, 
with each qubit trapped at the focus of a la- 
ser beam. By controlling the positions of the 
atoms, Ebadi et al. programmed specific in- 
stances of the MIS problem, each of which 
can be visualized as a graph with an atom 
at each node and with bonds between block- 
aded pairs (see the figure). They sought to 
solve the problem using an approach known 
as adiabatic quantum computation (4, 10). 
Here, the system parameters are ramped 
from an initial state in which the minimum- 
energy configuration is simple and known 
to a final state where the minimum-energy 
configuration provides a solution for the MIS 
problem. In the laser-driven atomic system, 
depending on whether the photon energy is 
lower or higher than the energy of a Rydberg 
excitation, the minimum-energy configura- 
tion can either have all atoms in the ground 
state or have as many atoms in the Rydberg 
state as possible without violating the block- 
ade constraint. Thus, by ramping the fre- 
quency and intensity of the lasers, the atoms 
are driven from their initial ground states 
into a configuration of Rydberg excitations 
that, ideally, forms an MIS. 
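
The two end points of the ramp can be illustrated with a purely classical toy energy, E = -δ Σ n_i + U Σ n_i n_j, where n_i = 1 marks an excited atom and the second sum runs over blockaded pairs. The symbols δ (`detuning`) and U (`penalty`), their values, and the four-atom square are assumptions made for this sketch. The laser drive that connects the two regimes in the real experiment is left out, so this shows only where the ramp starts and where it is meant to end.

```python
from itertools import product


def energy(config, edges, detuning, penalty=10.0):
    """Toy energy: reward excitations when detuning > 0, punish blockade violations."""
    excitations = sum(config)
    violations = sum(config[i] * config[j] for i, j in edges)
    return -detuning * excitations + penalty * violations


def ground_states(n, edges, detuning):
    """Brute force over all 2**n configurations; return the lowest-energy ones."""
    configs = list(product((0, 1), repeat=n))
    energies = [energy(c, edges, detuning) for c in configs]
    lowest = min(energies)
    return [c for c, e in zip(configs, energies) if e == lowest]


# Hypothetical example: four atoms on a square, nearest neighbors blockaded
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

print(ground_states(4, edges, detuning=-1.0))  # [(0, 0, 0, 0)]
print(ground_states(4, edges, detuning=+1.0))  # [(0, 1, 0, 1), (1, 0, 1, 0)]
```

With negative detuning every excitation costs energy and all atoms stay in the ground state; with positive detuning the lowest-energy configurations are exactly the maximum independent sets of the square.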

The key to this method is the mainte- 
nance of adiabaticity within the system—to 
ensure that the quantum system remains 
in its lowest-energy state throughout the 
ramping process. As an analogy, think of 
a waiter delivering an ice cream float to a 
diner. If the waiter moves too quickly, the 
drink may spill out, yet if he moves too 
slowly, the ice cream will melt—both of 
which are undesirable from the perspective 
of the diner. Similarly, in the quantum ex- 
periment, the system parameters must in- 
crease slowly enough for the atoms to settle 
into the MIS and yet fast enough for the 
quantum system to maintain its coherence, 
which is ultimately limited by the lifetime 
of the Rydberg states.

Coloring a map with a quantum computer

How many blue regions can this map have without
any of them sharing a border? Instead of examining
all the possibilities classically, Ebadi et al. solved the
puzzle by using a quantum computer, composed
of individual atoms that can only be excited
if all neighboring atoms are in their ground
states. The map puzzle is encoded as a
network of nodes, which represent the regions, and
connections, which represent the shared borders.

A crucial question is whether this quan-
tum algorithm provides a speedup over clas-
sical approaches. State-of-the-art classical
algorithms employ a strategy known as sim- 
ulated annealing, which mimics a physical 
process of preparing the interacting system 
at a high temperature and gradually reduc- 
ing the temperature to reach the lowest- 
energy state. In practice, neither the quan- 
tum nor the classical algorithm always suc- 
ceeds in finding the optimal solution. Thus, 
a figure of merit for the performance of the 
algorithm is the average number of times 
(t) that the algorithm would need to be run
to succeed in finding the MIS. For the clas-
sical approach, this number of iterations
is proportional to the ratio of the number of 
near-perfect solutions to the number of per- 
fect solutions, where a near-perfect solution 
is defined as having one fewer “country” in 
its set than a perfect solution. 
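
The ratio described above can be counted directly for a small graph. The sketch below is a brute-force illustration on a hypothetical five-node path, valid for graphs with at least one node. It computes only the ratio of near-perfect to perfect solutions and does not attempt to reproduce any proportionality constant from Ebadi et al. The function names are assumptions.

```python
from itertools import combinations


def count_independent_sets(n, edges, size):
    """Number of independent sets with exactly `size` nodes."""
    edge_set = {tuple(sorted(e)) for e in edges}
    return sum(
        1 for s in combinations(range(n), size)
        if not any(pair in edge_set for pair in combinations(s, 2))
    )


def degeneracy_ratio(n, edges):
    """Return (|MIS|, perfect count, near-perfect count, near-perfect / perfect)."""
    mis_size = max(k for k in range(n + 1)
                   if count_independent_sets(n, edges, k) > 0)
    perfect = count_independent_sets(n, edges, mis_size)
    near_perfect = count_independent_sets(n, edges, mis_size - 1)
    return mis_size, perfect, near_perfect, near_perfect / perfect


# Hypothetical example: five nodes on a path
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(degeneracy_ratio(5, edges))  # (3, 1, 6, 6.0)
```

Here the single perfect solution {0, 2, 4} competes with six near-perfect ones, so a search that wanders among good solutions is six times more likely to land on a near miss.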

By contrast, the performance of the 
quantum algorithm not only depended 
on how many near-perfect solutions there 
are for every perfect one, it also depended 
on the gap in energy between the lowest- 
energy state and the first excited state. The 
smaller this gap, the slower a ramp one 
theoretically expects to require for the sys- 
tem to remain in its lowest-energy state to 
reach the perfect solutions. In cases where
a sufficiently slow ramp could be per- 
formed within the time scale of the experi- 
ment, the quantum algorithm provided a 
speedup compared with the classical one. 
Specifically, the number of tries required 
by the quantum algorithm to solve the MIS 
problem scaled as the three-fifths power 
of the number of tries required classi-
cally, meaning that if the MIS problem is
made more difficult such that the time re- 
quired to solve it classically increases, for 
example, by a factor of 32, then the time re- 
quired by the quantum computer will only 
increase by a factor of 8. 
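
The arithmetic behind this example can be checked directly. Writing t_q and t_c for the quantum and classical numbers of tries (symbols introduced here only for illustration), the reported scaling and the factor-of-32 example read:

```latex
t_q \propto t_c^{3/5}, \qquad 32^{3/5} = \left(2^{5}\right)^{3/5} = 2^{3} = 8 .
```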

Although the experiment by Ebadi et al. 
is not the first to explore quantum optimi-
zation algorithms (11-13), it stands out for
operating both with a large number of qu- 
bits and with sufficiently coherent interac- 
tions for quantum information to spread 
across the entire system within the time 
scale of the experiment (14). This combina- 
tion appears to be crucial for the observed 
quantum speedup. 

An important question for future work 
is whether the improved scaling of the 
quantum algorithm persists as the dif- 
ficulty of the problems is increased, for 
example, by increasing the number of qu- 
bits. A potential challenge is that the gap 
in energy separating perfect from near- 
perfect solutions is expected to shrink as 
the number of qubits grows (3, 17), placing 
ever more stringent demands on how slowly 
the system parameters must be swept, and 
hence also on the coherence time of the 
experiment. One hope is to adopt the ap- 
proach of an experienced waiter who moves 
so quickly that the drink begins to slosh, 
but ultimately executes just the right mo- 
tions to bring it back to rest (5). Finding 
the right motions in a complex quantum 
system is bound to be a challenge, offering 
fertile ground for future research.

Phosphorus through the looking glass


A key building block enables a general synthesis of chiral phosphorus drugs 


By Xavier Verdaguer


In the book Through the Looking-Glass,
Alice suspected that the
milk on the other side of the look-
ing glass might not be good for the
cat to drink. Perhaps unbeknownst to
the writer Lewis Carroll, this subplot
reflects an important aspect of chemis- 
try—that the mirror image of a molecule 
is not necessarily the same as the origi- 
nal because of a feature known as chiral- 
ity. Just like our left and right hands, the 
mirrored image of a chiral molecule is not 
superimposable on the original. On page
1230 of this issue, Forbes and Jacobsen (1) 
describe an approach for the selective syn- 
thesis of a single mirror image of 
chiral pharmaceutical compounds, 
the chirality of which originates
at a phosphorus atom. The ap- 
proach may make the synthesis of 
highly desired chiral drugs much 
more attainable. 

Chirality is essential to life. 
Amino acids and sugars are chi- 
ral and are found as single right- 
handed or left-handed isomers. 
Therefore, proteins and most bio- 
molecules are intrinsically chiral. 
Chirality in molecules most of- 
ten originates from carbon atoms 
with four different substituents 
in a tetrahedral disposition, with 
the central atom labeled as “ste- 
reogenic” or as a “stereocenter.” 
After chemists Jean-Baptiste Biot 
and Louis Pasteur discovered and 
rationalized the phenomenon of 
chirality in organic compounds, 
others soon realized that elements 
beyond carbon, like phosphorus, 
could also be stereogenic, that 
is, P-stereogenic. In 1911, chem- 
ists Jakob Meisenheimer and Leo 
Lichtenstadt were the first to sep- 
arate a P-stereogenic compound 
into its individual mirror isomers 
(2). Although this work initially 
seemed like nothing more than 
a scientific curiosity, a few decades
later P-stereogenic
phosphorus ligands found appli- 
cations in the industrial synthesis 
of levodopa (L-dopa), a compound 
for treating Parkinson’s disease
(3). William S. Knowles received the 2001 
Nobel Prize in Chemistry for this discovery. 

Nowadays, P-stereogenic compounds 
have found their way into numerous drugs. 
To make antitumoral and antiviral treat- 
ments more effective, nucleoside drugs 
are often introduced into the body as pro- 
drugs, which are medications that become 
active only after entering the patient, and 
are used to help improve a medication’s 
effectiveness. These phosphorus prodrugs 
contain a P-stereogenic atom, as in teno- 
fovir alafenamide (4). Cyclic dinucleotides 
like cyclic guanosine monophosphate-ad- 
enosine monophosphate (cGAMP), which 
are promising candidates for cancer treat-
ments, contain two P-stereogenic phos-
phorus atoms (5). The synthesis of such
phosphorus derivatives in a precise and
stereocontrolled manner is of vital impor-
tance because it will determine the perfor-
mance of the final drug.

Single-handed phosphorus drugs
Merck's strategy uses a 50/50 mixture of right- and left-handed starting
material. This method is highly substrate specific. Forbes and Jacobsen
use a nonchiral starting material. The key single-handed building block (1)
allows for the synthesis of multiple chiral phosphorus drugs through
sequential substitution.

Until very recently, the synthesis of 
single-handed P-stereogenic compounds 
relied on the use of other chiral molecules 
known as stoichiometric chiral auxilia- 
ries. These are initially attached to the 
phosphorus atom but must be discarded 
by the end of the synthesis (6, 7). An ini- 
tial breakthrough in this field came in 
2017 when chemists at Merck developed 
a highly specific catalyst for the synthesis 
of a P-stereogenic prodrug compound us- 
ing a dynamic kinetic resolution strategy 
(8). Here, a mixture of equal parts 
left-handed and right-handed
starting material was used. The 
catalyst attaches to the starting 
material and exerts two functions: 
It provides a bias for the reaction 
with the nucleoside such that the 
right isomer reacts faster than the 
left one, and it interconverts the 
two mirror isomers, allowing the 
nonreacted left isomer to be con- 
verted to the desired right-handed 
product. Very recently, a similar 
strategy was used in the P(III) 
phosphoramidite coupling of oli- 
gonucleotides using a chiral phos- 
phoric acid catalyst. The approach 
allowed the stereocontrolled syn- 
thesis of cyclic dinucleotides like 
cGAMP (9). However, both meth- 
ods are substrate specific and do 
not serve as a general strategy 
for the synthesis of P-stereogenic 
compounds. 

Forbes and Jacobsen describe an 
alternative and general approach 
based on a symmetrical—that is, 
nonchiral—starting material that 
is modified to introduce chirality 
by using a so-called desymmetriza- 
tion reaction (see the figure). The 
authors used a starting material
with two chlorides. The catalyst, a urea-
sulfinamide, promoted the substitution
of only one of the chlorides by an amine 
to yield a highly enriched single-handed 
P-stereogenic chlorophosphonamide. This 
is a highly versatile P-stereogenic build- 
ing block because the chloride group and 
the amino group have the orthogonal re- 
activity necessary for the stereocontrolled 
sequential substitution of the two groups. 
Basic reagents react with inversion of con- 
figuration with the chloride, whereas the 
amine can be substituted under acidic 
conditions for alcohols. This approach al- 
lows the synthesis of a large variety of 
P-stereogenic compounds, including phos- 
phonates, phosphinates, phosphonami- 
dates, and phosphonate thioesters. The 
potential of this methodology is demon- 
strated by Forbes and Jacobsen in the syn- 
thesis of a few P-stereogenic drugs. 
Phosphine chloride compounds are 
highly reactive. To harness the reactivity 
of dichlorophosphinyl derivatives in a de-
symmetrization reaction is an impressive
achievement. However, this success comes 
with a caveat. The versatile chlorophosph- 
inamide building block cannot be isolated 
in pure form because it is prone to racemi- 
zation, that is, an equilibration to a 50/50 
mixture of right- and left-handed isomers. 
This hampers the enrichment of the isomer
of interest and ultimately might limit the 
use of this method. Future developments 
in the catalytic synthesis of P-stereogenic 
compounds should seek methods with im- 
proved selectivity and stable P-stereogenic 
intermediates that can compete with state- 
of-the-art traditional chiral auxiliary ap- 
proaches. The work of Forbes and Jacobsen 
will encourage chemists to work on further 
developments and to explore the wonders 
that await us beyond the looking glass.

Solving the nuclear pore puzzle

Using a battery of tools, the architecture of
the nuclear pore complex is revealed


By Thomas U. Schwartz 


In eukaryotic cells, the genome is seques-
tered in the nucleus, shielded from the
cytoplasm by the double-layered nuclear 
envelope (NE). Transport of macromol- 
ecules across the NE occurs through 
nuclear pore complexes (NPCs), which 
perforate the NE at ~200 to 2000 positions 
(1-3). Ions and molecules up to ~40 kDa dif-
fuse through NPCs, whereas larger cargo se-
lectively associate with soluble nuclear trans-
port factors to be ferried through the central
NPC channel (4). But it has been unclear how 
NPCs exactly control the transport of a vast 
array of different substrates, including solu- 
ble proteins, embedded membrane proteins, 
RNAs, and even some viral capsids. On pages 
1174, 1175, 1176, 1177, and 1178 of this issue, Bley 
et al. (5), Petrovic et al. (6), Mosalaganti et al. 
(7), Zhu et al. (8), and Fontana et al. (9), re- 
spectively, now provide molecular structures, 
in unprecedented detail, of how NPCs are 
built. These findings will enable approaches 
to further dissect the many NPC functions. 
With an estimated size of 60 to 120 MDa, 
depending on the eukaryotic species, NPCs 
are elaborate protein assemblies. The 600 to 
1000 individual proteins, collectively called 
nucleoporins (NUPs), are organized around 
a central eightfold rotational symmetry into 
protomers (also sometimes referred to as 
“spokes”). Protomer segments build four con- 
centric rings, referencing their position rela- 
tive to the pore opening in the NE: cytoplas- 
mic ring (CR), inner ring (IR), nucleoplasmic 
ring (NR), and luminal ring (LR). Further, 
each protomer consists of subcomplexes, 
which are defined as biochemically stable 
entities into which NPCs disassemble dur- 
ing cell division and NE breakdown. About 
half of the core scaffold, which encompasses 
~25 different NUPs, has previously been po- 
sitioned within the NPC structure, primarily 
through a combination of x-ray structural 
analysis of individual subcomplexes and
cryo-electron tomography (cryo-ET) recon-
structions (or maps) of the entire NPC (1-3).
The papers in this issue now fill many of the 
remaining gaps. 

Starting from an established protocol 
for the cryo-ET analysis of human NPCs, 
Mosalaganti et al. improved the resolution of 
the CR and IR from ~2.3 to ~1.2 nm. In addi- 
tion, the authors used artificial intelligence- 
based structure prediction and subcomplex 
modeling to interpret the improved cryo-ET 
map. The NPC scaffold also serves as the an- 
chor for phenylalanine-glycine (FG) repeats 
that extend their fibrillar extensions into 
the central channel of the NPC to form the 
principal transport barrier (10). Mosalaganti
et al. can now position, in addition to the 
central NUP62-NUP58-NUP54 complex, the 
second main FG-containing assembly—the 
CR-attached NUP62-NUP88-NUP214 com- 
plex—completing the anchor points for the 
FG network. 

The same cryo-ET map was also inter- 
preted by Petrovic et al. and Bley et al.; these 
authors used extensive experimental data 
to support the fitting of experimental struc- 
tures. Petrovic et al. focus on linker NUPs 
that connect subcomplexes within the NPC. 
They establish how the large, stacked helical 
proteins NUP188 and NUP205 interact com- 
petitively with NUP93. Together with a num- 
ber of weaker interaction motifs, the authors 
provide a linker map of the IR. 

Bley et al. focus on the CR and specifically 
on the attachment of so-called cytoplasmic 
filaments, which are particularly important 
for mRNA export. NUP358 is an extended, 
multidomain protein, and five copies an- 
chor to the CR through the amino-terminal 
75 kDa α-solenoid element, solved by Bley et
al. by using x-ray crystallography. The flex- 
ibly linked NUP93-NUP205 pair, hitherto 
considered a component of the IR, is also 
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with two chlorides. The catalyst, a urea- 
sulfinamide, promoted the substitution 
of only one of the chlorides by an amine 
to yield a highly enriched single-handed 
P-stereogenic chlorophosphonamide. This 
is a highly versatile P-stereogenic build- 
ing block because the chloride group and 
the amino group have the orthogonal re- 
activity necessary for the stereocontrolled 
sequential substitution of the two groups. 
Basic reagents react with inversion of con- 
figuration with the chloride, whereas the 
amine can be substituted under acidic 
conditions for alcohols. This approach al- 
lows the synthesis of a large variety of 
P-stereogenic compounds, including phos- 
phonates, phosphinates, phosphonami- 
dates, and phosphonate thioesters. The 
potential of this methodology is demon- 
strated by Forbes and Jacobsen in the syn- 
thesis of a few P-stereogenic drugs. 
Phosphine chloride compounds are 
highly reactive. To harness the reactivity 
of dichlorophosphiny] derivatives in a de- 
symmetrization reaction is an impressive 
achievement. However, this success comes 
with a caveat. The versatile chlorophosph- 
inamide building block cannot be isolated 
in pure form because it is prone to racemi- 
zation, that is, an equilibration to a 50/50 
mixture of right- and left-handed isomers. 
This hampers the enrichment of the isomer 


“This approach allows the 
synthesis of a large variety of 
P-stereogenic compounds...” 


of interest and ultimately might limit the 
use of this method. Future developments 
in the catalytic synthesis of P-stereogenic 
compounds should seek methods with im- 
proved selectivity and stable P-stereogenic 
intermediates that can compete with state- 
of-the-art traditional chiral auxiliary ap- 
proaches. The work of Forbes and Jacobsen 
will encourage chemists to work on further 
developments and to explore the wonders 
that await us beyond the looking glass. 

In eukaryotic cells, the genome is sequestered
in the nucleus, shielded from the
cytoplasm by the double-layered nuclear 
envelope (NE). Transport of macromol- 
ecules across the NE occurs through 
nuclear pore complexes (NPCs), which 
perforate the NE at ~200 to 2000 positions 
(1-3). Ions and molecules up to ~40 kDa diffuse
through NPCs, whereas larger cargo selectively
associates with soluble nuclear transport
factors to be ferried through the central
NPC channel (4). But it has been unclear exactly
how NPCs control the transport of a vast
array of different substrates, including solu- 
ble proteins, embedded membrane proteins, 
RNAs, and even some viral capsids. On pages 
1174, 1175, 1176, 1177, and 1178 of this issue, Bley 
et al. (5), Petrovic et al. (6), Mosalaganti et al. 
(7), Zhu et al. (8), and Fontana et al. (9), re- 
spectively, now provide molecular structures, 
in unprecedented detail, of how NPCs are 
built. These findings will enable approaches 
to further dissect the many NPC functions. 
With an estimated size of 60 to 120 MDa, 
depending on the eukaryotic species, NPCs 
are elaborate protein assemblies. The 600 to 
1000 individual proteins, collectively called 
nucleoporins (NUPs), are organized around 
a central eightfold rotational symmetry into 
protomers (also sometimes referred to as 
“spokes”). Protomer segments build four con- 
centric rings, referencing their position rela- 
tive to the pore opening in the NE: cytoplas- 
mic ring (CR), inner ring (IR), nucleoplasmic 
ring (NR), and luminal ring (LR). Further, 
each protomer consists of subcomplexes, 
which are defined as biochemically stable 
entities into which NPCs disassemble dur- 
ing cell division and NE breakdown. About 
half of the core scaffold, which encompasses 
~25 different NUPs, has previously been po- 
sitioned within the NPC structure, primarily 
through a combination of x-ray structural 
analysis of individual subcomplexes and
cryo-electron tomography (cryo-ET) reconstructions
(or maps) of the entire NPC (1-3).
The papers in this issue now fill many of the 
remaining gaps. 

Starting from an established protocol 
for the cryo-ET analysis of human NPCs, 
Mosalaganti et al. improved the resolution of 
the CR and IR from ~2.3 to ~1.2 nm. In addi- 
tion, the authors used artificial intelligence- 
based structure prediction and subcomplex 
modeling to interpret the improved cryo-ET 
map. The NPC scaffold also serves as the an- 
chor for phenylalanine-glycine (FG) repeats 
that extend their fibrillar extensions into 
the central channel of the NPC to form the 
principal transport barrier (10). Mosalaganti
et al. can now position, in addition to the 
central NUP62-NUP58-NUP54 complex, the 
second main FG-containing assembly—the 
CR-attached NUP62-NUP88-NUP214 com- 
plex—completing the anchor points for the 
FG network. 

The same cryo-ET map was also inter- 
preted by Petrovic et al. and Bley et al.; these 
authors used extensive experimental data 
to support the fitting of experimental struc- 
tures. Petrovic et al. focus on linker NUPs 
that connect subcomplexes within the NPC. 
They establish how the large, stacked helical 
proteins NUP188 and NUP205 interact com- 
petitively with NUP93. Together with a num- 
ber of weaker interaction motifs, the authors 
provide a linker map of the IR. 

Newly resolved components of the human nuclear
pore complex include the luminal ring (orange),
cytoplasmic filaments (yellow), and lipid membrane
(white). The structure known in 2016 is in blue.

Bley et al. focus on the CR and specifically
on the attachment of so-called cytoplasmic
filaments, which are particularly important
for mRNA export. NUP358 is an extended,
multidomain protein, and five copies anchor
to the CR through the amino-terminal
75 kDa α-solenoid element, solved by Bley et
al. by using x-ray crystallography. The flexibly
linked NUP93-NUP205 pair, hitherto
considered a component of the IR, is also
positioned in two locations of the CR and
one position of the NR, in entirely different
environments than in the IR. One rationale
for this moonlighting function, at least in 
the CR, is the role of NUP93 in anchoring 
the cytoplasmic NUP62-NUP88-NUP214 
FG-containing complex, which is analogous 
to the central NUP62-NUP58-NUP54 FG- 
containing complex in the IR. Although only 
one NUP62-NUP88-NUP214 complex is mod- 
estly resolved in the protomer cryo-ET map, 
stoichiometric and steric arguments suggest
that there are two copies per protomer (or 16 
copies per CR). Overall, the three studies of 
human NPCs arrive at a reassuringly similar 
conclusion about the positioning of NUPs. 
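
The copy-number argument above is bookkeeping over the eightfold symmetry. A minimal Python sketch, assuming every protomer carries the same number of copies (the helper name `copies_per_ring` is illustrative and does not come from the studies):

```python
# Eightfold rotational symmetry of the NPC, as stated in the text.
PROTOMERS_PER_RING = 8


def copies_per_ring(copies_per_protomer: int) -> int:
    """Total copies in one ring, assuming identical protomers (an idealization)."""
    return copies_per_protomer * PROTOMERS_PER_RING


# Two NUP62-NUP88-NUP214 complexes per protomer, as the stoichiometric
# and steric arguments suggest.
fg_complexes_per_cr = copies_per_ring(2)
print(fg_complexes_per_cr)  # 16
```

The count holds only for an idealized, fully occupied ring; individual NPCs may deviate from it.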

Zhu et al. and Fontana et al. pursued a dif- 
ferent approach. Oocytes from the African 
clawed frog (Xenopus laevis) have nuclear 
membranes packed with NPCs, which en- 
ables single-particle reconstruction by use of 
cryo-electron microscopy (cryo-EM). In cryo- 
EM, many more particles are averaged than
in cryo-ET, which boosts resolution.
The studies of Zhu et al. and Fontana
et al. visualize a CR protomer at up to ~4
and ~7 Å, respectively. With well-resolved
secondary structure elements, and even res- 
idue-level resolution for substantial parts of 
the protomer, these studies enable detailed 
analysis. The findings suggest that the core 
scaffold of the vertebrate NPC is well con- 
served. As previously determined, the nine- 
membered core Y complex forms two concentric,
interwoven eight-membered rings (1-3).
The five NUP358 amino-terminal α-solenoid
structures per protomer are very similar in 
both analyses, as are two NUP205 molecules. 
Beyond that, the structures differ. Fontana et 
al. resolve two NUP62-NUP88-NUP214 com- 
plexes, as predicted by the cryo-ET-based 
studies, but Zhu et al. do not. Instead, they 
exclusively position two NUP93 molecules at- 
tached to the CR. These differences may be 
explained by different data processing strat- 
egies, slightly different sample preparations, 
or both. More likely, though, they point to a 
larger issue, which is the intrinsic heteroge- 
neity of NPCs. 

A limitation of these studies is that to 
achieve high resolution, many NPCs, and 
specifically individual protomers, need to 
be averaged. Given the high degree of struc- 
tural redundancy among scaffold NUPs, it 
is expected that NPCs are conformationally 
and constitutionally heterogeneous (3, 11). A
recent structural analysis of NPCs from the 
yeast Saccharomyces cerevisiae provided the 
clearest example of this to date, in which different
compositional classes of NPCs were
visualized (12). The more ambiguous parts
of the different reconstructions published in 
this issue may point to inadvertent averag- 
ing over nonidentical protomers, resulting in 
partially distorted densities. This is difficult 
to separate from flexibility, which commonly 
reduces map contrast. 

A prominent example of conformational 
heterogeneity within the human NPC is the 
recent discovery that the central channel 
is substantially wider in intact cells when 
compared with partially purified NPCs (57 
versus 43 nm, respectively) (13, 14). This conformational
change is largely confined to the
IR and LR, not the two outer rings. It may 
thus not be a coincidence that the CR is the 
best-resolved NPC element, likely reflecting 
higher homogeneity. How can the IR adopt 
such different conformations? The consensus 
from the studies of Mosalaganti et al. and 
Petrovic et al. is that the protomers do not 
structurally change but rather move as rigid 
blocks, enabled by highly flexible linkers that 
keep the protomers connected. This way, pe- 
ripheral channels are established that enable 
transport of embedded membrane proteins 
from the outer to inner nuclear membrane, 
and past the NPC. 

Overall, the studies of Bley et al., Petrovic et 
al., Mosalaganti et al., Zhu et al., and Fontana 
et al. substantially advance the knowledge 
of the assembly of NPCs, primarily in ver- 
tebrates. NPCs can now be functionally and 
structurally probed in unprecedented detail. 
Although much of the approach by the differ- 
ent researchers can be likened to solving a gi- 
ant jigsaw puzzle, it is surprising that several 
identical pieces fit into many different posi- 
tions. It will be interesting to tease apart the 
potential functional relevance of this unusual 
binding behavior and to reveal NPC biology 
at the level of detail comparable with that of 
other central problems in cell biology. 

Aligning mealtimes to live longer

Calorie restriction, fasting, and circadian rhythms sync
together for a long, healthy life in mice


By Shaunak Deota and Satchidananda Panda 


Calorie restriction (CR) involves
chronic reduction of energy intake by
20 to 40% without inducing malnutrition
(1). CR extends life span in multiple
animal models and reduces the
risk of age-associated disorders, most
of which arise from metabolic dysfunction 
and inflammation. However, extended daily 
fasting or aligning daily meal timing to the 
active period, even without reducing energy 
intake, can also improve health and increase 
life span in model organisms. On page 1192 
of this issue, Acosta-Rodriguez et al. (2) re- 
veal the specific contribution of fasting and 
timing of calorie-reduced meals to the effi- 
cacy of CR, as estimated by life-span exten- 
sion in male mice. 

In most rodent CR studies, the control 
group is fed ad libitum (AL), whereas the 
CR animals are fed a single meal per day 
that contains ~20 to 40% fewer calories than 
the AL group consumes. The CR animals eat 
most of their daily ration within 2 hours (3). 
Thus, CR studies inadvertently introduce 
timed or time-restricted feeding and pro- 
longed daily fasting, both of which can im- 
prove health and delay aging independently 
of CR (4). Disentangling the effect of CR, fast- 
ing, and time of feeding on rodent life span is 
not easy. Acosta-Rodriguez et al. developed a 
system that automatically delivers a specific 
quantity of food in a bolus, as in standard CR 
studies, or in small meals at specific times (3). 
All mice were housed under 12 hours of light 
and dark to synchronize circadian rhythms 
(internal body clock), and all cages were 
equipped with wheels that measured volun- 
tary wheel-running. 

To assess the impact of calories alone on 
life span, they split the CR ration (30% re- 
duced calories relative to the AL group) into 
nine equal meals delivered every
160 min (CR-spread), thus minimizing
the chance of prolonged
fasting. To assess the contribution
of the length of fasting, they
delivered the same CR ration
within a 2-hour window (CR-2h
or 22-hour fasting) or split into
eight equal meals delivered every
90 min over 12 hours (CR-12h
or 12-hour fasting). To test the
impact of the CR ration fed during
the day or night, they delivered
the CR-2h and CR-12h in
the day (CR-day-2h, CR-day-12h)
or night (CR-night-2h, CR-night-12h)
(see the figure).

Despite all CR groups receiving
the same quality and quantity
of diet, their life spans differed.
In the CR-spread group,
the median life span was 10% 
longer than that of the AL group. 
All other CR groups that had 12- 
hour or 22-hour fasting lived lon- 
ger than the CR-spread group, 
demonstrating that fasting can 
boost the life-extending effect 
of CR. However, the median life 
span of CR-day-2h or CR-day-12h 
mice fed during their daytime 
rest period was extended by 20%, 
whereas mice in the CR-night-2h 
and CR-night-12h groups (when 
they are active) lived almost 35% 
longer than mice in the AL group 
(2). Thus, the life-extending effect of fasting 
is further boosted when it overlaps with the 
circadian sleep period. 
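
The feeding arithmetic behind these groups can be checked with a short Python sketch. It assumes each meal occupies its full delivery interval, so that eight meals 90 min apart span 12 hours; the function names are illustrative and are not taken from the study.

```python
MINUTES_PER_DAY = 24 * 60


def feeding_window_min(n_meals: int, interval_min: int) -> int:
    """Feeding period length, assuming each meal fills one interval slot."""
    return n_meals * interval_min


def daily_fast_hours(window_min: int) -> float:
    """Hours per day left outside the feeding window."""
    return (MINUTES_PER_DAY - window_min) / 60


# Feeding windows in minutes, built from the schedules described in the text.
windows = {
    "CR-spread": feeding_window_min(9, 160),  # nine meals, every 160 min
    "CR-12h": feeding_window_min(8, 90),      # eight meals, every 90 min
    "CR-2h": 2 * 60,                          # single 2-hour window
}
fasts = {name: daily_fast_hours(w) for name, w in windows.items()}
print(fasts)  # {'CR-spread': 0.0, 'CR-12h': 12.0, 'CR-2h': 22.0}
```

Under this assumption the nine CR-spread meals tile the full 24 hours, which is why that group has no prolonged fast, whereas the other two schedules reproduce the 12-hour and 22-hour fasts quoted above.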

Despite eating less, all mice at middle age 
in the CR groups were more active than mice 
in the AL group. CR and fasting increase the 
production of ketone bodies from the liver, 
which can act on the circadian clock in the 
brain to increase food-seeking activity (5). 
Physical activity and exercise exert pleiotro- 
pic health benefits in mice and humans (6). 
Accordingly, as the mice aged, more-active 
mice also lived longer. However, among the 
CR groups, the CR-spread mice were more 
active in older age and yet lived the short- 
est amount of time. The CR-spread mice had 
slightly more daytime activity, coinciding 
with their mealtime, and there was a trend 
toward daytime activity and reduced life 
span. The CR-day mice also had slightly more 
daytime activity, and they did not live as long 
as the CR-night mice. Although sleep was not 
measured, being more active during the day 
likely disrupts sleep in nocturnal rodents. 
Altogether, these results imply that maintaining
a robust sleep-wake and fasting-feeding
cycle aligned with the circadian clock can
boost the life span-extending effect of CR.


Meal times and calorie restriction affect longevity
Circadian alignment of feeding to the most active part of the day (nighttime in
mice) and fasting boost the life span-extending effects of calorie restriction (CR).

The fasting-feeding cycle in all CR groups
except CR-spread likely sustained better cir- 
cadian rhythms. Although the light-dark 
cycle entrains the central circadian clock and 
supports sleep (7), the fasting-feeding cycle 
sustains robust circadian rhythms in periph- 
eral organs (8). Hence, optimum alignment 
of fasting and feeding with the light-dark 
cycle can maintain robust circadian rhythms, 
which in turn activate numerous pathways
in different organs at the optimum time (9). 

Aligning fasting-feeding with the circadian 
clock-programmed sleep-wake cycle to gain 
health benefits has some precedent. In day-active
fruit flies, nightly fasting leads to life-span
extension compared with daytime fasting
(10). Mutations in circadian clock genes
disrupt feeding and sleeping patterns and 
also dampen CR-dependent life-span exten- 
sion in fruit flies and mice (11, 12). Even when 
mice are fed an isocaloric diet, those fed dur- 
ing the active phase have improved health 
and extended life span (8). 

To investigate the cause of death, Acosta- 
Rodriguez et al. examined mice that were 
found dead of old age or moribund mice that 
were euthanized. They found that cancer was 
the major cause of death in all groups and 


Circadian alignment 
and fasting effect 


1200 1300 


that liver cancer was the most 
prevalent type, suggesting that CR 
delayed the onset or severity of 
cancer. To gain clues about the un- 
derlying mechanism, the authors 
probed the liver transcriptomes 
of all cohorts. CR attenuated the 
age-associated gene expression 
changes observed in AL mice and 
maintained a gene expression sig- 
nature indicative of better meta- 
bolic homeostasis. The CR-night 
group predominantly exhibited 
changes in immune gene expres- 
sion, suggesting reduced inflam- 
mation. This group also had more 
genes expressed in alignment 
with circadian rhythms compared 
with CR-day groups. Circadian 
gene expression temporally op- 
timizes cellular processes, which 
may partly explain the longer life 
span of CR-night groups. 

Beyond the liver transcrip- 
tomes, there was little clue about 
life-span effects in different CR 
groups. Body weight and body 
composition were comparable 
among all CR groups. Likewise, 
in humans, 25% CR with or with- 
out 8-hour time-restricted eating 
(8-hour CR or 16-hour fast) for 1 
year led to similar weight loss (13).
Conversely, in weight-stabilized 
humans, 6-hour time-restricted 
eating improved cardiometabolic 
health compared with habitual eating within 
12 hours (14). This implies that changes in
body weight or composition in response to 
dietary interventions that involve CR, fasting, 
or meal timing may not accurately predict the 
overall effectiveness of these interventions, al- 
though perhaps trials with larger cohorts and 
more restrictive timing are needed. Studies 
of plasma and tissues from people undergo- 
ing CR, fasting, or meal-time restriction hold 
untapped potential for understanding the 
system-wide impact of these dietary interven- 
tions and predicting the extent to which they 
can prevent or manage chronic diseases. 

Immuno-epidemiology and the predictability of viral evolution

Understanding viral evolution depends on a synthesis of
evolutionary biology and immuno-epidemiology

By Chadi M. Saad-Roy, C. Jessica E. Metcalf, Bryan T. Grenfell


Despite much recent progress in modeling
the epidemiology and evolution
of acute viruses, a full quantitative
synthesis of viral eco-evolutionary
dynamics remains elusive. The severe
acute respiratory syndrome coronavirus 2
(SARS-CoV-2) pandemic has stimulated
vast research efforts into measuring viral dy- 
namics and genetics, across scales from indi- 
vidual hosts to global circulation. In parallel, 
understanding determinants of individual 
immune protection against infection and se- 
vere disease has been a major research focus. 
However, the interaction between population 
immunity and viral dynamics has been much 
less studied—a crucial gap because this in- 
teraction will strongly influence SARS-CoV-2 
evolution over time. Clarifying the biology of 
transmission at different scales, and in par- 
ticular the impact of immunity on transmis- 
sion, will define the epidemiological context 
(current spread and population risk), as well 
as the epidemiological and evolutionary im- 
plications, of immune escape. 

To probe likely evolutionary trajectories, 
it is necessary to understand how immu- 
nity intersects with transmission and thus 
population outcomes, especially in partially 
immune individuals. Building such a
“phylodynamic” synthesis (1) requires a revolution
in cross-scale understanding of how
individual immune kinetics translate to im- 
muno-epidemiology. The resulting frame- 
works will provide a more mechanistic ba- 
sis for exploring the predictability of viral 
evolutionary dynamics. Such mechanistic 
approaches synthesize findings across dis- 
ciplines, and they will provide a principled 
way to integrate information across scales 
and inform data analyses. 

Phylodynamic models meld the epide- 
miological and evolutionary dynamics of 
pathogens, often with underlying host immune
kinetics. Such frameworks have been
applied across a range of acute and chronic
pathogens (1, 2), and often focus on influenza
eco-evolutionary dynamics as a model
for acute partially immunizing infections. 
In particular, the immune escape of sea- 
sonal variants of human influenza A virus 
has been used to study the impact of popu- 
lation immunity on viral population dy- 
namics. The simplest conceptual qualitative 
phylodynamic models for immune escape 
(1) posited that spread of escape variants 
would be maximized at intermediate im- 
mune pressure, through a trade-off between 
transmission and immune selection. That 
is, if there is no immune pressure, then vi- 
ral abundance may be high but selection 
for immune escape is absent. Conversely, 
if there is strong immune pressure, then 
selection may be high but viral abundance 
very low; this would also limit adaptive evo- 
lution for immune escape. 
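
The trade-off described above can be made concrete with a toy calculation in Python. This is only an illustrative sketch: it assumes viral abundance falls linearly and selection for escape rises linearly with immune pressure, which is a deliberate simplification and not the functional form used in the cited models. The name `escape_spread` is hypothetical.

```python
def escape_spread(immune_pressure: float) -> float:
    """Toy rate of escape-variant spread for immune pressure in [0, 1]."""
    abundance = 1.0 - immune_pressure  # assumed: more immunity, less virus
    selection = immune_pressure        # assumed: more immunity, more selection
    return abundance * selection


# Scan immune pressure from 0 to 1 in steps of 0.01.
grid = [i / 100 for i in range(101)]
best = max(grid, key=escape_spread)
print(best)  # 0.5
```

With these assumed shapes the product vanishes at both extremes and peaks at intermediate immune pressure, which is the qualitative pattern the conceptual models posit; where the peak falls depends entirely on the assumed curves.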

More detailed dynamic models for in- 
fluenza have addressed quantitative phy- 
lodynamic interactions across scales (from 
within hosts to global spread). These models 
also explored what limits viral diversity to 
the observed phylogeny that emerges from 
antigenic drift (changes in surface proteins 
that lead to immune escape), given the huge 
variation generated by error-prone viral re- 
production (3, 4). An array of related fram- 
ings of immunodynamics in various guises 
[such as strain-transcending immunity (3) 
and repeated selective sweeps driven by 
herd immunity (4)] have been explored as 
candidates to explain these phylodynamic 
patterns for influenza A virus. 

Phylodynamic models have been widely 
applied beyond seasonal influenza (2). In 
particular, the dynamics of seasonal hu- 
man coronaviruses (HCoVs) are becoming 
increasingly salient, especially owing to 
the continued spread of SARS-CoV-2. For 
example, the 229E HCoV exhibits antigenic 
drift (5), which could have implications for 
the emergence of SARS-CoV-2 variants. 
Indeed, the ongoing COVID-19 pandemic 
underlines the major impact not only of 
immune escape evolution, but also of rapid 
selection for (surprisingly large) increases 
in viral transmission rate. Although the
acquisition of human-to-human transmission
by influenza viruses that caused pandemics
is arguably an example of selection
on transmission, this is not as evident
for seasonal influenza. 

A key issue for next-generation phylo- 
dynamics is to determine how selection at 
different levels (within-host, transmission 
chains, population-level, globally) trans- 
lates into population outcomes, and how 
it is modulated by host immunity. There 
are important processes present at each 
biological scale, from the emergence of vari- 
ants to their global spread (supplementary 
fig. S1). Within individual hosts, variants 
arise through mutation and/or recombina- 
tion (1). The ability of variants to replicate 
(which is necessary for successful transmis- 
sion) will be affected by their cellular tro- 
pism and by the efficiency with which they 
can enter cells and transmit across tissues. 
For example, increases in angiotensin-con- 
verting enzyme 2 (ACE2) avidity may lead 
to enhanced SARS-CoV-2 transmissibility 
(6) and may also alter tissue tropism, e.g., 
for the upper respiratory tract (7). Tropism 
could also depend on host immune re- 
sponses; adaptive immunity (acquired 
through infection or vaccination) may also 
shape viral load trajectories [e.g., for influ- 
enza (8)] and lead to selection for immune- 
escape variants [e.g., the Omicron variant 
(9)]. Tropism and adaptive immunity also 
likely play a role in the clinical severity of 
infections (and reinfections) with variants. 

The combination of individual immune 
phenotypes and their impact on viral shed- 
ding affects the transmission of variants 
and determines whether they can cause 
breakthrough infections in vaccinated in- 
dividuals. Simultaneously, antigenic drift or 
waning immunity may influence suscepti- 
bility to (re)infection. In turn, these lead to 
fitness advantages for variants with either 
increased immune escape or transmissibil- 
ity. Individual characteristics of immunity 
might be particularly important in shaping 
the features of the virus. For example, selec- 
tion of influenza viruses within immunolog- 
ically competent hosts is less strong owing 
to asynchrony between viral growth and im- 
mune response (8), whereas prolonged car- 
riage in immunocompromised hosts could 
result in variants (10); this may also be the 
case for SARS-CoV-2 (11). 

At the population level, the presence of 
many immune individuals could limit vi- 
ral spread through indirect protection. To 
prevent the establishment of variants with 
increased transmissibility (but with little 
immune escape) once they have emerged, 
a higher proportion of immune individuals
is required than is needed to limit
transmission of the original virus. Indirect
protection from immune escape variants
will depend on specific host and viral prop- 
erties, including susceptibility to reinfec- 
tion and transmission from breakthrough 
infection. Additionally, global movement 
seeds new SARS-CoV-2 outbreaks, poten- 
tially in regions with vastly different levels 
of preexisting infection-induced or vaccinal 
immunity and other epidemiological fac- 
tors. Depending upon this immune land- 
scape, population-level selection may favor 
variants with either higher transmissibility 
or increased immune escape. Inequitable 
vaccine coverage increases global infection 
levels, which increases the likelihood for 
variants with either of these characteristics. 
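
For intuition on the earlier point that a more transmissible variant demands a higher immune fraction, the classic threshold 1 - 1/R0 from simple homogeneous-mixing models can be sketched in Python. The formula assumes random mixing and fully protective immunity, and the R0 values below are illustrative placeholders, not estimates for any virus or variant.

```python
def immune_fraction_needed(r0: float) -> float:
    """Classic threshold 1 - 1/R0; zero when the pathogen cannot spread."""
    if r0 <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r0


# Hypothetical basic reproduction numbers, chosen only for illustration.
original_r0 = 2.5
variant_r0 = 5.0

print(round(immune_fraction_needed(original_r0), 2))  # 0.6
print(round(immune_fraction_needed(variant_r0), 2))   # 0.8
```

Partial immune escape, waning immunity, and transmission from breakthrough infections all violate these assumptions, which is why the text stresses that indirect protection depends on specific host and viral properties.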

To untangle the effects of immunity on
transmission, it will be crucial to quantify
the cross-scale impact of host immune
responses on epidemiology and viral
evolution. Although several studies have made
progress in this general direction (8, 10),
there remain important gaps that need to
be addressed. In particular, epidemiological
studies that simultaneously quantify
immunity and transmission across populations
and at different biological scales are
necessary, along with the development of
cross-scale modeling approaches. There are
a number of study designs and associated
measurements that could address these
gaps and determine the impact of immunity
on cross-scale phylodynamics (see the
figure). Long-term longitudinal studies
that monitor immunity across cohorts and
ages, in addition to measuring viral loads,
sequence variation, and household
transmission, are essential. Using these data in
cross-scale modeling frameworks could
elucidate the impact of immunity on
transmission and viral variation.

Studying immuno-epidemiology
Cohort studies are needed to understand the impact of immunity on phylodynamic evolutionary patterns.
There are two classes of unknowns: the impact of vaccination for individuals with different degrees of immunity,
and the impact of vaccination on subsequent susceptibility to and transmissibility of breakthrough infections.
Figure labels: Vaccination; Serology; Recovery; Viral load, sequences; Monitoring across ages and degree of immune competence.

The phylodynamics of the COVID-19 pandemic
have provided many, often nasty,
surprises compared to expectations honed
by the dynamics of other viruses, notably
influenza virus. For example, the observation
that some hosts sicken with COVID-19,
but do not then strongly seroconvert (12),
complicates interpretation of measuring
antibody titers for dissecting the impact
of population immunity. Assuming these
immunologically “cryptic” individuals are
capable of virus transmission, do they
contribute to net evolution of the virus (e.g.,
owing to the presence of unmeasured
immunity)? It is also unclear how host immune
responses besides neutralizing antibodies
(such as T cell immunity, which is
harder to measure) affect evolutionary
trajectories. Moreover, whether existing
population immunity to endemic viruses creates
a cross-protective firewall against emergent
pathogen spillover, and how this might
occur, needs further investigation (as does
the potential impact of animal reservoirs).

Perhaps the central question for future 
SARS-CoV-2 variants is the relationship be- 
tween transmission rate, immune escape, 
and clinical severity. Specifically, is there a 
“worst case” viral genotype that combines 
high transmission, immune escape, and se- 
verity? This will depend in a complex way 
on—still partially understood—cross-scale 
viral and immune dynamics. Unfortunately, 
the virus is likely to explore these phylody- 
namics before they can be predicted, espe- 
cially given continued transmission arising 
from global inequities in vaccine supply. 
One general strategy for exploring these 
complexities is to generate phylodynamic 
case law by broadening studies to explore 
the phylodynamics of multiple pathogens 
using recently developed multiplex immu- 
nological methods (13) with viral sequenc- 
ing. Analyzing immunity against multiple 
pathogens and variants will also address 


issues such as antigenic imprinting (the im- 
munity conferred after recovery from the 
first infection) in influenza, the potential 
contribution of immunologically cryptic 
individuals to pathogen evolution, serotype 
interactions in dengue, and the polymicro- 
bial impact of nonpharmaceutical interven- 
tions used to mitigate COVID-19. 

New phylodynamic models and data 
structures should be developed to answer 
these questions. For example, studies have 
pioneered statistical methods to determine 
transmission patterns and evolutionary 
history from sequence data (14), and a fitness-based
model to predict future incidence
of influenza clades (15). Combining 
either approach with models that account 
for within-host kinetics and immuno- 
epidemiology may be a particularly fruit- 
ful avenue to unravel the impact of host 
immunity on pathogen eco-evolutionary 
dynamics. For SARS-CoV-2 specifically, 
it is important to quantify immune land- 
scapes at all spatial scales (local, regional, 
and global). Crucially, this quantification 
will enable the development of tools and 
infrastructure that will aid in predicting 
future epidemic and evolutionary trajec- 
tories. These suggested approaches echo 
proposals for a Global Immunological 
Observatory (i.e., regular mass blood sampling
to monitor immunological signatures) (13),
but with an important evolutionary
focus to elucidate the cross-scale 
impacts of immunity on pathogen dynam- 
ics. More generally, such immuno-epidemi- 
ology and viral discovery could help eluci- 
date the phylodynamics and vaccinology of 
a much wider range of pathogens. 
CLIMATE CHANGE 


Land management can contribute to net zero 


The voluntary carbon market needs to embrace changes for the land sector 


By Ruth DeFries, Richie Ahuja, Julio Friedman, Doria R. Gordon, Steven P. Hamburg, Suzi Kerr, James Mwangi, Carlijn Nouwen, Nitin Pandit 


Demand for credits on the voluntary carbon market is poised to surge as
corporations implement net-zero commitments. Approximately half of
all credits issued from 2000 to 2021 on the voluntary carbon market
related to land use, mostly from forest projects (fig. S1) (1).
Credits from the land sector pose challenges for accounting and for 
meeting multiple criteria. We propose three 
pathways to overcome shortcomings in the 
carbon market, improve integrity of credits, 
and promote long-lasting change to achieve 
nontrivial climate mitigation and co-benefits 
from the land sector: (i) target major sources 
of land-based emissions by increasing activities
that reduce or avoid non-CO₂ greenhouse
gas (GHG) emissions; (ii) promote 
longevity of low-GHG land management by 
ensuring that locally relevant co-benefits ac- 
crue to local land users; and (iii) encourage 
region-wide over individual project-based 
activities to promote systemic change, pro- 
vide equitable access to benefits, enable real- 
istic accounting, and scale opportunities for 
emissions reductions. 

Agriculture, forestry, and land use account 
for ~20% of global anthropogenic GHGs 
(from enteric fermentation in livestock and 
manure, agricultural soils, crop burning, de- 
forestation, cropland degradation, and rice 
cultivation in decreasing order of emissions) 
(2). Reducing emissions from this sector is 
essential to address the goals of the Paris 
Agreement. In addition, restoration and im- 
proved management of multiple types of eco- 
systems can provide opportunities to remove 
carbon from the atmosphere and store it in 
the biosphere, although the magnitudes are 
uncertain and potential land conflicts raise 
serious concerns. Estimates indicate that up 
to 60% of emissions from agriculture rela- 
tive to 2030 business-as-usual projections 
and 110% of emissions from the forestry sec- 
tor are technologically and economically fea- 
sible to reduce (3). 

Multiple approaches can economically 
incentivize reduced emissions and carbon 
sequestration from land management, in- 
cluding fines for violating regulations, sub- 
sidies and tax credits, capped emission in 
cap-and-trade programs, and payments. All 
of these approaches can play a role in a tran- 
sition to low-GHG land management. The 
current net-zero commitments and explosive 
growth in the voluntary carbon market of- 
fer an unprecedented opportunity to bring 




finance to the land sector for mitigating cli- 
mate change, if criteria and accounting can 
overcome problems of low integrity and in- 
equitable benefit sharing. 

The integrity of land-based credits in the 
voluntary carbon market has been histori- 
cally uneven. Buyer confidence can only oc- 
cur with high-integrity credits—in other 
words, assurance that credits represent doc- 
umented, actual reductions, avoidance, or 
removal of GHGs that would otherwise not 
occur. This integrity needs to improve sub- 
stantially for these types of investments to 
effectively reduce GHG emissions or remove 
them from the atmosphere in voluntary car- 
bon markets. 

Multiple factors contribute to low confi- 
dence in carbon market credits, including 
difficulties with assurances of additionality 
(whether interventions to reduce emissions 
or store carbon would occur in the absence 
of revenue from the carbon market), per- 
manence (the integrity of how reversed 
benefits of avoided or stored emissions are 


addressed), leakage (whether reductions 
in one place are displaced by emissions to 
another place), and quantification (whether 
reductions are accurately quantified relative 
to an appropriate baseline and reported). 
Another factor contributing to low confi- 
dence is the potential for displacement and 
inequitable benefit sharing with marginal- 
ized and Indigenous peoples in places with 
conflicts between customary and statutory 
land tenure, asymmetric power relations, or 
poor governance. Such negative outcomes 
have occurred with previous implementa- 
tion of financing for forest-based climate 
mitigation (4). 

In the absence of improved integrity, car- 
bon markets are likely to shift toward proj- 
ects that buyers perceive as higher quality, 
such as long-term geologic storage from 
carbon capture and storage, that more easily 
meet the criteria. Without alternative path- 
ways to improve integrity of credits from the 
land sector, opportunities for the sector to 
contribute to cost-effective climate mitiga- 
tion and its potential co-benefits for biodiver- 
sity and local livelihoods could be negated. 


NON-CO₂ GHGS 

The Sixth Assessment of the Intergovernmental Panel on Climate Change highlights the opportunity to reduce non-CO₂ GHGs, particularly methane (CH₄), as a mechanism to reduce the rate of warming over the near to medium term and provide relief from climate change. Land-based activities are substantial anthropogenic sources of non-CO₂ GHGs, which drive the bulk of warming between now and 2050. Enteric fermentation, manure, and rice cultivation together emit more CH₄ than fossil fuels, and agriculture emits roughly four times as much nitrous oxide (N₂O) as fossil fuels (table S1). Until the recent pledge to reduce CH₄ emissions from a diversity of sectors by 30% by 2030, adopted by more than a hundred countries at the 26th Conference of Parties in Glasgow, there has been no systematic effort to drive down these critical non-CO₂ emissions. 

Currently, credits issued and number of 
projects in the voluntary carbon market 
are overwhelmingly related to projects that 
avoid or reduce CO₂ emissions or sequester 


10 JUNE 2022 » VOL 376 ISSUE 6598 1163 


INSIGHTS | POLICY FORUM 


carbon (see the figure and tables S2 and S3). Land use-related CH₄ and N₂O cause more than half of the climate impacts from the land sector. However, only 2% of the credits issued for land-based projects from 1996 to 2021 aimed to reduce emissions of CH₄, and practically none addressed N₂O (1). In terms of number of all land-based projects, 17% aimed to reduce CH₄ and 1% N₂O (fig. S2). As an example of the imbalance between sources of emissions and credits in the market to reduce those emissions, enteric fermentation emits almost as much CO₂ equivalents (100-year time frame) globally as does forest conversion, yet only about 300 credits (two projects) addressed CH₄ reductions through feed additives in contrast to more than 300 million (140 projects) for forest-focused projects, despite the relatively low or negative costs per tonne of CO₂ equivalent of many of these activities. 
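
Comparisons like the one above rest on converting each gas to CO₂ equivalents with a 100-year global warming potential (GWP100). The sketch below shows only that arithmetic. The GWP factors are the IPCC Fifth Assessment Report values without climate-carbon feedbacks (28 for CH₄, 265 for N₂O), which may differ from the factors used for the figures quoted here, and the emission quantities are hypothetical.

```python
# GWP100 factors from IPCC AR5 (without climate-carbon feedbacks).
# Assumption: the article does not state which factors it uses.
GWP100 = {"CO2": 1.0, "CH4": 28.0, "N2O": 265.0}


def co2_equivalent(tonnes_by_gas: dict) -> float:
    """Sum emissions of several gases as tonnes of CO2 equivalent."""
    return sum(GWP100[gas] * tonnes for gas, tonnes in tonnes_by_gas.items())


# Hypothetical annual emissions for one farm, in tonnes of each gas.
example_emissions = {"CO2": 100.0, "CH4": 10.0, "N2O": 1.0}
total_co2e = co2_equivalent(example_emissions)  # 100 + 280 + 265
print(total_co2e)
```

In this made-up example the CH₄ and N₂O terms together outweigh the CO₂ term, which mirrors the point that non-CO₂ gases can dominate land-sector climate impacts even when emitted in much smaller masses.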

For the carbon market to make 
substantial contributions to reduc- 
ing emissions of multiple GHGs, the 
disproportionately high reliance on 
forest-related projects needs to be 
augmented by increased focus on 
enteric fermentation in livestock, 
irrigation management, and other 
agricultural projects. In particular,
projects to reduce and avoid
CH₄ are underrepresented in comparison
to its contribution to land-based
emissions. Methods to establish
accurate baselines will help 
legitimize these initiatives. 


LOCAL BENEFITS 

Initiatives that reduce, avoid, or remove emissions benefit people indirectly through the global effect on climate. Local people involved in land management where the projects are implemented do not necessarily benefit directly beyond financial payments, which are often meager after accounting for transaction costs and low carbon prices. However, some buyers will pay premiums for projects with social and environmental co-benefits that contribute to Sustainable Development Goals. Co-benefits can accrue locally, such as employment opportunities. Other co-benefits, such as habitat conservation, affect local land users only indirectly or could have negative impacts such as increased human-wildlife conflict or displacement. In those cases, local land users are less likely to participate over the long term or across larger regions, resulting in unrealized potential for climate mitigation.

Evidence from land management projects indicates stronger outcomes for conservation if local land users receive tangible benefits and participate in regional land use and management decisions (5). Direct benefits to local land users through sustained returns of financial capital, tangible ecosystem services, or cultural value are necessary for land-based activities to successfully reduce, avoid, or remove emissions over the long term and at the scale required to meaningfully contribute to climate mitigation. Without such direct benefits, carbon mitigation efforts are likely to be short-lived and small-scale with limited effect on climate mitigation. With direct benefits, land users can experience the advantages and promote shifts to low-GHG practices within their communities.

A carbon market that facilitates transformations to low-GHG land management will value activities that are participatory and locally perceived as beneficial. For example, large-scale tree-planting schemes that aim to improve carbon sequestration and livelihoods are more effective if they include tree species preferred by local people for fuelwood, fodder, and grazing (6). Similarly, cookstoves that reduce the need for fuelwood, improve air quality, and save labor can shift norms toward more carbon-efficient sources of energy for cooking, and agroforestry can provide income for the local population while storing almost as much carbon as native forests (7). This focus on local needs and preferences will likely prove more effective for climate mitigation than short-term projects with more indirect co-benefits. Carbon market standards should ensure that local benefits occur.

Security of land tenure underpins the ability of the carbon market to both deliver benefits to local land users and create confidence for investors in land-use initiatives. Insecure tenure, gaps between de facto and de jure tenure, and colonial legacies of state control over land are common throughout the world, particularly in developing regions. In such places, which account for as much as 65% of the world’s total land area (8), the carbon market can incentivize local land users to participate only if outside parties will not lay claim to the benefits. Robust land-based initiatives in the carbon market need to clarify and ensure that benefits flow to those who manage the land.

[Figure: Ample emissions, insufficient credits. (Top) Global emissions are shown from land-based sources of CO₂, CH₄, and N₂O in 2019 (in CO₂ equivalents on a 100-year time frame) (15). (Bottom) Credits issued from all four major registries in the voluntary carbon market from 1996 through November 2021 (1) are categorized by the main greenhouse gas affected by the project. The top two highest-emission sources (top) and project types for credits issued (bottom) are shown for each gas (along with “other” indicating all source and project types not encompassed by the top two categories shown for each gas). Approximately 10% of credits issued for forest projects are buffers for reserve credits in the event of reversal. Tables S2 and S3 list all land-based sources and project types. REDD+, Reducing emissions from deforestation and forest degradation. Bottom-panel axis: Credits issued (millions), grouped as Reduce CO₂, Reduce CH₄, and Reduce N₂O.]


LARGE AREAS, REGIONALLY BASED 

A shortcoming of the current project-based 
carbon market is its small-scale, piecemeal 
approach through individual projects. 
Consequently, the market has difficulty 
fostering transitions to low-GHG land man- 
agement over larger areas for a meaningful 
aggregated impact. 
Jurisdictional and regional approaches, 
which aim to align governments, businesses, 
nongovernmental organizations, and local 
stakeholders around common goals for land 
management within an administrative unit, 
are becoming more common and offer 
possibilities for overcoming some of the 
shortcomings of a project-based approach 
[see (9) for examples]. Implementation of 
a jurisdictional approach faces hurdles in 
places with poor governance, corruption, 
and lack of enforcement, but these ap- 
proaches potentially provide an alternative 
to individual projects if participants in the 
carbon market can overcome these hurdles. 

Unlike individual project-based efforts, 
jurisdictions or region-wide organizations 
can foster low-GHG land management 
through credit and access to inputs for 
land users to alter feed for livestock, plant 
trees on farms, improve fertilizer manage- 
ment, eliminate crop residue burning, or 
other practices that require upfront in- 
vestments. These efforts would be coupled 
with enforced regulations within jurisdic- 
tions against illegal deforestation and fire 
and other incentives to foster a transition 
to low-GHG land management. For ex- 
ample, municipality-level initiatives in the 
Brazilian Amazon successfully reduced 
deforestation by linking credit to farmers 
with municipality-level deforestation rates 
to implement the federal Plan to Prevent 
and Control Amazon Deforestation. 

A carbon market that purchases credits 
from land users in jurisdictions that pro- 
mote low-GHG land management could 
also help address the existing disincentive 
to potential suppliers from high transaction 
costs. More land users in a jurisdiction could 
aggregate their projects and reduce costs. 
Supply of high-quality carbon credits has 
been hampered by high costs for individual 
project development, monitoring, and other 
transactions (10). Small-scale farmers, which 
globally constitute 84% of all farmers (11), 
and Indigenous communities, which manage 
a quarter of the world’s land area (12), are 
at a disadvantage to participate in the cur- 
rent carbon market, yet could make outsized 
contributions to effective climate mitigation. 
Pooled capabilities at a jurisdictional or re- 
gional scale to develop, monitor, and report 
could help facilitate participation of small- 
scale land users and promote durable trans- 
formation in land management. 

Although the evidence base is only begin- 
ning to develop, jurisdictional and regional 
approaches potentially lower transaction 
costs and reduce, though do not eliminate, 
problems of additionality and leakage (9). At 
a project level, additionality is compromised 
when voluntary participation in response to 
an economic incentive is biased toward land 
users who already intended to change prac- 
tices. This “adverse selection” problem dimin- 
ishes as actors are grouped within larger ju- 
risdictions and jurisdiction-wide support for 
low-GHG land management enables other 
land users to participate. Ambitious baselines 
below business as usual and dynamic adjust- 
ments in a structure predefined by crediting 
institutions could help reduce the problem 
of adverse selection at a jurisdictional level. 
Jurisdiction-wide monitoring alleviates the 
problem of leakage within jurisdictions be- 
cause credits are issued only for net changes 
at the jurisdictional scale. Leakage could still 
occur with neighboring jurisdictions and 
across national boundaries, a problem that 
also occurs with project-based markets and 
which is very difficult to quantify and address 
through market or other mechanisms. 
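
The difference between project-level and jurisdiction-level crediting described above can be sketched with a toy calculation. This is an illustration only, not the accounting method of any registry or crediting standard. The parcel names and emission changes are hypothetical, and baselines are assumed to be already agreed.

```python
def project_level_credits(changes: dict) -> float:
    """Credit every parcel that cut emissions, ignoring increases elsewhere."""
    return sum(-change for change in changes.values() if change < 0)


def jurisdiction_level_credits(changes: dict) -> float:
    """Credit only the net reduction across the whole jurisdiction."""
    net_change = sum(changes.values())
    return max(0.0, -net_change)


# Hypothetical change in annual emissions relative to baseline (tCO2e).
# Negative values are reductions; the farm's increase represents leakage.
parcel_changes = {
    "enrolled_forest": -1000.0,
    "neighboring_farm": 400.0,
    "other_parcels": 0.0,
}

project_total = project_level_credits(parcel_changes)
jurisdiction_total = jurisdiction_level_credits(parcel_changes)
print(project_total, jurisdiction_total)
```

In this example, project-level accounting would issue 1000 credits, while jurisdiction-wide accounting would issue 600 because the displaced emissions inside the jurisdiction are netted out. Leakage to parcels outside the jurisdiction would remain invisible to both calculations.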

High-resolution remote sensing and 
artificial intelligence provide a new, low- 
cost way for jurisdictions and regions to 
monitor land management at a parcel level 
repeatedly across large areas. Many com- 
panies are pursuing means to both track 
land-use changes and combine that infor- 
mation with payment delivery platforms 
that land users can access through mobile 
phones. Efforts to pool project develop- 
ment, technical advice, monitoring, and 
payments are more feasible now than in 
the early days of the Clean Development 
Mechanism and voluntary carbon market. 
These tools can enable markets globally 
and help ensure that credits represent ac- 
tual climate mitigation. 

Despite these potential benefits of ju- 
risdictional and regional approaches for 
changing land management norms across 
large areas and addressing additional- 
ity and leakage, an overriding concern 
for both project-based and jurisdictional 
approaches is the need for fair and equi- 
table benefit sharing. Land rights and gov- 
ernance are deeply political and involve 
entrenched power dynamics (13). It is un- 
realistic to expect that the carbon market 
can resolve centuries-old problems of inse- 
cure land rights and procedural injustice. 
As a small step toward the larger goal of 
reducing these injustices, buyers on the 
carbon market can insist that strong and 
verified social safeguards, principles of 
informed consent, and fair benefit shar- 
ing are enforced as fundamental require- 
ments for registration. These safeguards 
are not currently applied uniformly across 
the standards for certifying credits for the 
voluntary carbon market (14). Even when 
safeguards are applied, they are not a 
guarantee against unfair and inequitable 
practices, particularly where governments 
control forest lands that local people use 
customarily. More rigor is needed to be able 


to assure buyers that they are purchasing 
from places where land conflicts and gov- 
ernance are fairly and equitably managed. 

Where governance structures make such 
efforts possible, jurisdictional and regional 
approaches potentially reduce transaction 
costs for developing baselines, monitoring, 
and reporting; enable wider participation 
from land users; shift land use norms; and 
enhance integrity for buyers in the volun- 
tary carbon market. As jurisdictional ap- 
proaches are rapidly evolving, the research 
community plays a key role in developing 
the evidence base about problems and suc- 
cesses in implementation, particularly re- 
garding fair and equitable benefit sharing 
in places with insecure land rights. 

The urgency of the climate crisis de- 
mands effective interventions to reduce, 
avoid, and remove emissions in all sectors. 
Low-GHG land management is not a pana- 
cea. Yet, with efforts to reduce and remove 
emissions in other sectors, it can contrib- 
ute to a durable transformation toward 
addressing the crisis while providing local 
benefits for people and ecosystems. 
SCIENCE LIVES 


To Psyche and beyond 


The future of US innovation will depend on teamwork, 
maintains a planetary scientist in a new memoir 


By Vijaysree Venkatraman 


This summer, a spacecraft will begin
its 3.4-year journey toward Psyche, a
huge metal-rich asteroid located between
Mars and Jupiter. Most other
asteroids (there are ~1,500,000 of
them in the asteroid belt) are rocky or 
icy bodies, but Psyche is thought to be the 
exposed nickel-iron core of an early planet. 
This metallic world, 173 miles at its widest, 
could offer planetary scientists a glimpse 
into the early days of the Solar System and 
the formation of Earth. 

In her new memoir, A Portrait of the 
Scientist as a Young Woman, Lindy Elkins- 
Tanton, the principal investigator of the 
NASA-funded Psyche mission, tells the en- 
gaging story of her life in science—from her 
tentative beginnings as an undergraduate 
student researcher in geology to her current 
position as a planetary scientist at Arizona 
State University’s School of Earth and Space 
Exploration. The book also offers insights 
into the workings of academia, followed by 
suggestions for restructuring the research 
enterprise to transform the pace of innova- 
tion and education in the United States. 

Elkins-Tanton was born into an upper 
middle-class family in Ithaca, New York, in 
1956. Although she had material comforts, 
she did not have a happy childhood. Her re- 
lationship with her mother was fraught, and 
she battled depression in her youth. Quoting 
the poet Natalie Diaz, she writes: “You are 
not the sum of your injuries.” “But perhaps I 
am the sum of the injuries that I have over- 
come,” she adds. The realization 
that each of us is only a tiny part 
of a vast, unexplored Universe has 
both comforted and motivated 
her throughout the years. 

The author writes engagingly of her own research and reveals how she and other researchers tease out truths about the Universe from bits of rock. Her investigations of the Siberian flood basalts (the result of Earth’s largest land-based volcanic event, which occurred 250 million years ago) have revealed, for example, that the eruptions produced gases not unlike the ozone-depleting halocarbons humans produce today.

If we imagine our 4568-million-year-old Solar System as a 24-hour day, the upcoming Psyche mission could help us learn more about the building materials of rocky planets that were formed within the first 20 seconds. The launch of the spacecraft involves the efforts of some 800 people and 11 years of work. “While we can’t go back in time, we can bridge that gulf of space,” Elkins-Tanton writes. “We can send a robot to find out what Psyche really is.”

Turning her attention to the broader business of modern research, Elkins-Tanton identifies the “hero model” of academic science as a major barrier to progress. In most universities, the leading scholar in any given area of research has ownership of a pyramid of resources dedicated to a given topic, she maintains. These “heroes” have an outsize influence on how knowledge is created, how it is funded, and how it is adopted and regulated by society. It is time to bid goodbye to these heroes, she argues, and create an organizational culture that supports teams, knowledge goals, and societal outcomes.

“The stereotype of a lone man in a lone lab making brilliant progress is a lie: every scientist needs to work with, contend with, and convince their community, and also, it’s so often now a woman and not a man,” she writes. To address big challenges in science and society, we need the breadth of ideas that comes from a diversity of voices.

With the research community moving toward a more multidisciplinary, team-based model, Elkins-Tanton says it is also time for a new model of university education. She compares the traditional ways of teaching science and math at the undergraduate level to “trying to train dogs by using electric collars”: lots of judgment, little encouragement. Like graduate students, undergraduates should be taught research skills such as how to ask questions, synthesize information, and render opinions, she argues.

Bullying and sexual harassment in academia are other topics that are discussed in detail. Should successful or important people in academia be allowed to persist in a community when they are harming others? Her answer is an unequivocal no. Her stand on this topic, she writes, comes partly from her own experience of childhood abuse.

[Photo caption: Lindy Elkins-Tanton, principal investigator of NASA's Psyche mission, discusses small-bodies missions.]

[Book details: A Portrait of the Scientist as a Young Woman, Lindy Elkins-Tanton.]

This memoir chronicles the journey of 
one woman in science but is also a rallying 
cry to make academia a more supportive 
and diverse workplace so that the research 
community can better address the societal 
and scientific challenges of the 21st century. 
By the end of the book, Elkins-Tanton will 
have readers enthused about the upcom- 
ing Psyche mission and leave them with a 
greater appreciation of the teamwork that 
made it possible. 
HISTORY OF MEDICINE 


The surgeon and the soldiers 


Plastic surgery pioneer Harold Gillies transformed facial reconstruction during World War I 


By Cathy Newman 


A soldier who sustained serious
trauma to the face during World
War I bore a soul-crushing liability.
Those missing a leg or arm were apt
to evoke sympathy. But those unlucky
enough to “lose face” (in this
instance, the term is literal as well as figurative)
more often provoked disgust. Children,
medical historian Lindsey Fitzharris 
writes in The Facemaker, would flee from 
the sight of their fathers. One wounded 
corporal, upon catching sight of his face in 
a mirror, wrote his fiancée to break their 
engagement. In Germany, they were 
called Menschen ohne Gesicht (men 
without faces); in France, les gueules 
cassées (broken faces). 

Military weaponry in World War I 
far outstripped the ameliorations of 
military medicine. The deadly, disfigur- 
ing arsenal included ammunition with 
magnesium fuses, shrapnel, mortar 
bombs, caustic gases such as chlorine 
and phosgene, and flame throwers. 
Wood and canvas airplanes, mean- 
while, were essentially stacks of tinder 
that fed fires started by exploding gas 
tanks. Most airmen carried a pistol to 
end their own lives in the event that the 
plane they were piloting caught fire, 
notes Fitzharris. 

It is poignant and revealing that sur- 
geon Harold Gillies banned mirrors in 
his wards. The New Zealand-born, Uni- 
versity of Cambridge-educated pioneer 
in the emergent field of plastic surgery 
at the heart of Fitzharris’s story under- 
stood the searing trauma of facial in- 
jury and disfigurement. 

Before enlisting in the Royal Army 
Medical Corps in 1915, Gillies had 
worked in the posh private practice of a 
laryngologist on call to London’s Royal 
Opera House, where his most memorable 
experience was ministering to a ballerina 
who had accidentally sat on a pair of scis- 
sors. “The brutal hothouse of frontline sur- 
gery” that was the Great War was different 
altogether. There was no textbook, no instruction
manual in the radical reconstruction
that many soldiers’ injuries required, 
nor even the knowledge of how to admin- 
ister anesthesia to a patient with a gaping 
hole in the middle of his face. 

Skin grafting had been done in the 15th 
century to replace a nose tip sliced off by 
a dueling sword, but Gillies was working 
on a much larger canvas and understood 
that success depended on establishing and 
maintaining blood supply and warding off 
infection. He refined the procedure by lift- 
ing, but not severing, the graft from the do- 
nor site, stitching it into a tube, and moving 
the free end to the area of injury: his tubed 
pedicle became a reconstructive mainstay. 


A soldier wounded in 1916 smiles after
facial reconstruction surgery. 


But technical expertise is only part of 
Gillies’s legacy. His creation of the Queen’s 
Hospital at Sidcup in 1917—the first hospital 
dedicated to facial reconstruction—was dis- 
tinguished by its multispecialty staff, which 
created a fertile climate for innovation and 
collaboration. He employed artists and pho- 
tographers who could document operative 
technique and the before-and-after rendi- 
tions of surgery, dentists conversant in jaw 
anatomy and physiology, caring nurses and 
orderlies, and even a barber trained to shave 
faces marked by scars and missing flesh. 


The Facemaker: A Visionary 
Surgeon’s Battle to 

Mend the Disfigured 
Soldiers of World War I
Lindsey Fitzharris 

Farrar, Straus and Giroux, 
2022. 336 pp. 


To counter the book’s unsparing im- 
mersion in the atrocities of war, Fitzharris 
presents vivid depictions of humanity that 
act as a profound salve. A nurse prepares 
a glass of warm milk for a hospitalized 
patient. A woman looks past a soldier’s 
disfigurement and falls in love. Gillies 

meticulously plans each move in the 
operating room, committed to the res- 
toration of face and soul, agonizing 
when things go wrong. 

Interwoven through Fitzharris’s story 
is the irony of Gillies’s efforts, as encapsulated
in an exchange between a bacteriologist
and a surgeon in a London hotel
lobby overheard by journalist Harold 
Begbie. “Here we are, you and I, whose 
business it is to save life, in the midst 
of men whose business it is to destroy 
life,” the bacteriologist is said to have 
remarked. There was one bright spot, 
however, as surgeon Fred Albee ob- 
served: “That in the long run, human- 
ity would benefit from the knowledge 
surgeons had gained in time of war.” 

Indeed, this would prove to be the 
case. In addition to pioneering work 
in reconstructive plastic surgery, bat- 
tlefield medicine also led to improve- 
ments in orthopedic surgery, blood 
transfusions, and trauma surgery. 

After the war, Gillies became the 
first surgeon to construct a penis for a 
trans man, thereby pioneering gender 
affirmation surgery. “The world,” his 

patient wrote after the successful phallo- 
plasty, “began to seem worth living in.” De- 
spite negative feedback from some of his 
peers, Gillies understood that the patient’s 
needs came first. 

“The practice of medicine is an art,
not a trade; a calling...in which your heart will
be exercised equally with your head,” William
Osler, the father of American medicine, once
said. As Fitzharris so elegantly reveals, Harold
Gillies embodied this sentiment.


New treaty must address
ghost fishing gear 


In his News story “World’s nations start to 
hammer out first global treaty on plastic 
pollution” (23 February, https://scim.ag/ 
unplastictreaty), E. Stokstad discusses the 
issues that may be addressed by a new 
plastic treaty (1), including pollution result- 
ing from fishing activities. Because fish- 
ing gear is often made from long-lasting 
synthetic polymers, such as nylon (2), lost 
and abandoned gear is a long-term prob- 
lem. This type of pollution, known as ghost 
gear, is a serious and pervasive threat to 
the integrity of ecosystems (2). The first 
plastic treaty must address ghost gear in 
marine (3) and freshwater environments. 

Ghost gear affects aquatic ecosystems on 
every continent. Abandoned or lost nets, for 
example, trap and often kill large fish (e.g., 
elasmobranchs), crustaceans (decapods), 
turtles, mammals (including cetaceans), and 
other organisms (4-7). Although reports 
are more frequent from marine ecosystems, 
damage has occurred in inland water eco- 
systems as well (2, 7). Other animals, such 
as birds, are attracted to potential prey 
trapped in the ghost gear and can become 
entangled themselves (5, 8), generating 
a negative cascade effect (5). As Stokstad 
notes, the problem is exacerbated by the 
lack of reliable data on the frequency and 
degree of impact of ghost gear in aquatic 
ecosystems around the world. 

Given the increasing demand for 
resources to feed the world’s growing 
population, fishing will intensify in coming
years (3, 9), and the amount of ghost gear
in aquatic ecosystems will almost certainly 
increase as a result. To address this prob- 
lem, the plastic treaty should aim to reduce 
the risk fishing gear poses to the environ- 
ment. Possible strategies include replacing 
synthetic fishing gear with biodegradable 
alternatives, which are already available 
(10); limiting the sales of nylon nets; 
providing educational opportunities; and 
removing lost and abandoned fishing gear 
from ecosystems (2). In addition to drafting
the plastic treaty, all countries must
take urgent and comprehensive action
to combat the harm caused by fishing
activities.


Explanations for
nitrogen decline 


In their Review “Evidence, causes, and 
consequences of declining nitrogen avail- 
ability in terrestrial ecosystems” (15 April, 
eabh3767), R. E. Mason et al. argue that 
nitrogen has decreased in availability 
worldwide over the past century and that 
the decline is best explained by human-driven
elevated temperatures and CO₂. This
conclusion conflicts with previous studies 
showing strong increases in nitrogen avail- 
ability compared to preindustrial levels 

(1, 2). Mason et al. present two main types 
of observational trends as evidence that 
nitrogen has declined: a decline in Europe 
and the United States since 1990 in various 
nitrogen availability indices, and a worldwide
decline of nitrogen isotope ratios
(δ¹⁵N) in plant leaves, tree rings, and lake
sediments since 1920. We disagree that
rising temperatures and CO₂ levels are the
best explanation for these trends. 
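
For readers unfamiliar with the notation, nitrogen isotope ratios are conventionally reported in delta notation. The standard definition is sketched below; it is general background, not something specific to Mason et al. or to this letter. Atmospheric N₂ is the usual reference standard, and values are expressed in per mil:

```latex
\[
\delta^{15}\mathrm{N} = \left( \frac{R_{\mathrm{sample}}}{R_{\mathrm{standard}}} - 1 \right) \times 1000,
\qquad
R = \frac{{}^{15}\mathrm{N}}{{}^{14}\mathrm{N}}
\]
```

A declining δ¹⁵N therefore means that a sample holds proportionally less ¹⁵N relative to the standard than it did before.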

The decline in nitrogen since 1990 can 
be easily explained by reduced nitrogen 
emissions from fossil fuels and agriculture 
since 1990 in Europe and the United States 
(3). However, because nitrogen emissions 
remain far above preindustrial levels, high 
levels of nitrogen inputs in ecosystems 
continue to cause nitrogen eutrophication 
and biodiversity loss (4). The second trend 
can be explained by the human-driven 
shift since 1920 toward a much larger role 
of gaseous sources of reactive nitrogen in 
the global nitrogen cycle relative to direct 
uptake from soils and recycled residues 
(1, 4). Increasing numbers of livestock,
the urine and feces of which contain 
nitrogen that forms ammonia (NH₃), have
led to increased release of this reactive
nitrogen-containing gas into the atmosphere
(a process known as volatilization).
Artificial nitrogenous fertilizers, which are 
widely produced from nonreactive nitrogen 
gas (N₂), have also increased volatilization
of nitrogen as ammonia (5). Compared with 
nitrogen released through organic matter 
decomposition in soils, these gaseous origins
of reactive nitrogen are typically more
depleted in the stable isotope ¹⁵N (1, 6, 7).
The marked ¹⁵N depletion in plants
in natural ecosystems over the past cen- 
tury likely reflects these much-increased 
anthropogenic nitrogen emissions and 
gases (6, 8, 9) rather than lower nitro- 
gen availability as Mason et al. suggest. 
Therefore, we caution against Mason et al.’s
recommendation to fertilize seminatural 
ecosystems with nitrogen to improve car- 
bon sequestration. To prevent the negative 
effects of excess nitrogen (such as biodiver- 
sity loss), implementing this intervention 
should wait until more compelling evidence
is available.


Response

Olff et al. select only a subset of the evi- 
dence for declining nitrogen availability 
and assign unlikely mechanisms to reach
the conclusion that nitrogen availability is
not declining over large areas of Earth. We 
disagree that the evidence can be grouped 
into the categories that Olff et al. describe; 
the complete set of observations is wider 
in scope and cannot be explained by the 
mechanisms that the authors propose. 

Olff et al. claim that declines in nitro- 
gen emissions since 1990 can explain 
declining nitrogen availability. Our 
Review acknowledges reduced emis- 
sions, and the resulting reduction in 
atmospheric deposition of nitrogen onto 
ecosystems, as a likely contributing fac- 
tor. However, we also present long-term 
records of declining nitrogen availability, 
including declining nitrogen concentra- 
tions in plant leaves since around 1930 
(1, 2) and in plant pollen since the early 
1900s (3), as well as declines in a broad 
suite of soil nitrogen availability indicators
and stream water NO₃⁻ at Hubbard
Brook in New Hampshire, United States, 
that date back to the 1960s and 1970s (4, 
5). These observations predate reductions 
in nitrogen deposition. Moreover, as we 
explain in the Review, declines in nitrogen 
availability indicators have occurred in 
places that have never experienced sub- 
stantially elevated nitrogen deposition (1) 
and alongside declines in concentrations 
of other elements in plants (6-8). 

Olff et al. then propose that large-scale 
declines in natural abundance nitrogen 
isotope ratio (δ¹⁵N) values in sediment and
plants can be explained by a change over 
time in the isotopic signature of anthropo- 
genic nitrogen emissions toward isotopically 
lighter, reduced forms of nitrogen. However, 
the evidence they cite of possible effects 
of this shift on plant δ¹⁵N refers only to a
handful of case studies in atypical environ- 
ments (9-11). The isotopic ratio of deposited 
nitrogen is elevated by processes in soil that 
discriminate against ¹⁵N; the effects of such
processes increase with increasing nitrogen 
supply (2, 12). Models show that the isotopic 
signature of deposited nitrogen would have 
to be implausibly low to cause plant δ¹⁵N to
decline at the observed rate (2). 
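
The mass-balance reasoning behind that last point can be illustrated with a deliberately simple two-source mixing sketch. This is not the model used in the cited work: it assumes plant δ¹⁵N is a linear mix of soil-derived and deposited nitrogen, it ignores the soil fractionation described above, and every number in it is hypothetical.

```python
def required_deposition_delta15n(plant_delta, soil_delta, deposition_fraction):
    """Solve a linear two-source mixing equation for the deposition signature.

    Assumed mixing model (illustrative only):
        plant_delta = f * deposition_delta + (1 - f) * soil_delta
    where f is the fraction of plant nitrogen that comes from deposition.
    All delta values are in per mil.
    """
    if not 0 < deposition_fraction <= 1:
        raise ValueError("deposition_fraction must be in (0, 1]")
    soil_part = (1 - deposition_fraction) * soil_delta
    return (plant_delta - soil_part) / deposition_fraction


# Hypothetical scenario: soil nitrogen at +2 per mil, plants observed at +1 per mil.
for fraction in (0.05, 0.1, 0.3):
    needed = required_deposition_delta15n(
        plant_delta=1.0, soil_delta=2.0, deposition_fraction=fraction
    )
    print(f"deposition share {fraction:.2f}: required signature {needed:.1f} per mil")
```

In this toy setting, the smaller the share of plant nitrogen that comes from deposition, the more strongly depleted the deposited nitrogen must be to produce the same decline in plant δ¹⁵N.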

There is little doubt that massive and 
poorly managed anthropogenic nitrogen 
inputs have led to eutrophication and bio- 
diversity loss in many locations. However, 
rising atmospheric CO₂, warming, and several
other global changes are concurrently
driving a reduction in nitrogen availability 
(i.e., nitrogen supply relative to nitrogen 
demand). The well-documented increases 
in anthropogenic nitrogen supply noted by 
Olff et al. have not affected global ecosys- 
tems uniformly and are unlikely to be the 
overriding driver of changes in nitrogen 
availability across all terrestrial ecosystems. 


As we state in our Review, the fundamental
response to declining nitrogen availability
must be to reduce CO₂ emissions.
We point out that, although fertilization 
may be one option for increasing nitrogen 
availability to plants, microbes, and her- 
bivores, numerous factors must be taken 
into account when designing interventions 
that can achieve well-defined goals with- 
out unacceptable negative consequences. 
Further work is necessary to more fully 
demonstrate the extent of declines in nitro- 
gen availability, to clarify the underlying 
mechanisms, and to delineate appropriate 
responses. But before this can happen, the 
scientific evidence for declining nitrogen 
availability must be acknowledged.


PLANT SCIENCE

Host plant nurtures fungus 


Some fungi depend on their living hosts for
sustenance. The corn smut fungus Ustilago
maydis can grow independently but
depends on the host maize plant to reproduce.
Kretschmer et al. analyzed which
host nutrients are required to support this obligate
biotroph’s lifestyle. The fungus responds
to a combination of nutrients, including organic
acids such as malate, which maize uses as a
substrate for C4 photosynthesis. Identification
of dicarboxylate transporters showed that the
ability of the fungus to draw these organic acids
out of the host plant contributes to the pathogen’s
virulence. With such nutrition ensured,
the fungus can then move through its life cycle.


PHOTO: ARTMARI/SHUTTERSTOCK 


More predation in 
warmer seas 


Species richness of many taxa 

is higher near the equator, and 
ecologists have long hypoth- 
esized that this pattern is linked 
to stronger interactions between 
species (e.g., competition and 
predation) in the tropics. However, 
empirical evidence showing that 
the strength of species interac- 
tions varies with latitude is limited. 
Ashton et al. tested whether 
predation on benthic marine 
communities is higher at lower 
latitudes. Using a standardized 
experiment at 36 sites along the 
Pacific and Atlantic coasts of 
North and South America, the 
authors found both greater preda- 
tion intensity (consumption rate) 
and stronger impacts on benthic 
communities nearer the equator. 
These trends were more strongly 
related to water temperature 


SCIENCE science.org 


than to latitude, suggesting that 
climate warming may influence 
top-down control of communi- 
ties. —BEL 

Science, abc4916, this issue p. 1215 


Establishing early 
diversity 


Humans living an urbanized 
lifestyle in industrialized countries 
tend to have less diverse micro- 
biota than people living more rural 
existences. Using fecal 16S ribo- 
somal RNA sequencing, Olm et al. 
found that after the first 6 months 
of life, the microbiome of infants 
living in contrasting environments
diverged from Bifidobacteria- 
dominated assemblages. Deep 
metagenomic sequencing 
revealed that a large proportion 

of the bacterial species detected 
in samples from hunter-gatherer 
infants were new and were 


—PJH Science, abo2401, this issue p. 1187 


Corn smut fungus senses nutrients in its maize 
host that trigger its reproductive cycle. 


undetectable in samples from 
urbanized children. Gut micro- 
biota diversity appears early in the 
lives of hunter-gatherer infants 
and is traceable to maternal 
transmission, with some influence 
from the local environment. The 
main driver for differences among 
gut microbiota originates in life- 
style rather than geography. It is 
suspected, but still enigmatic, that 
such differences in microbiota 
have functional implications for 
the health of developing children. 
—CA 

Science, abj2972, this issue p. 1220 


Catalysis paired with 
crystallization 


Asymmetric catalysis often 
distinguishes mirror-image 
configurations at a single carbon 
center. However, many complex 
molecules have three or more
chiral centers, and selecting just
one of the ballooning number 
of diastereomers in such cases 
can be daunting. In this context, 
de Jesus Cruz et al. report that 
product crystallization during 
the reaction can supplement the 
catalyst's intrinsic selectivity. 
Specifically, the authors used a 
chiral base to set one stereo- 
center in a Michael addition of 
nitroalkanes to ketoamides while 
dynamically scrambling the 
configurations on the adjacent 
carbons. Crystallization then 
selects a single diastereomer from 
this interconverting mixture. —JSY 
Science, abo5048, this issue p. 1224 


Timing eating and fasting 
for longevity 


Animals fed a limited number of 
calories, just enough to avoid malnutrition,
show extended health
span and life span. However, they
are so hungry that they eat those 
fewer calories in a limited period 
of time and consequently spend 
more time fasting than do ani- 
mals for which access to food is 
not restricted. Acosta-Rodriguez 
et al. therefore designed experi- 
ments in mice to control both 
caloric intake and the timing of 
their eating to see which factors 
were the most important (see the 
Perspective by Deota and Panda). 
Caloric restriction extended life 
span as expected, but it worked 
best when feeding was restricted 
so that the animals fasted for 
at least 12 hours and when the 
period in which the animals ate 
corresponded to the active phase 
of their circadian cycle. —LBR 
Science, abk0297, this issue p. 1192; 
see also adc8824, p. 1159 


Patterns and process in 
RNA viruses 


Viruses are suspected to be 
lynchpins in ecosystem function, 
but so far we can only guess at 
their significance. DNA viruses are 
increasingly being recognized as 
significant components of biogeo- 
chemical cycling in the oceans. 
Dominguez-Huerta et al. explored
global patterns of marine RNA 
virus occurrence by extracting 
virus sequences from Tara Ocean 
samples. Host prediction analysis 
identified predominantly protist 
and fungal hosts plus a few inver- 
tebrates. Like double-stranded 


DNA viruses and their hosts, RNA 
viruses showed marked depth
limitation but little latitudinal 
change. Auxiliary metabolic 
genes in the RNA virome indi- 
cated that several eukaryote 
plankton processes are affected 
by viruses. A group of 11 RNA 
viruses that significantly influ- 
ence ocean carbon flux were 
identified. —CA 

Science, abn6358, this issue p. 1202 


Big step forward for
Parkinson’s disease
Inhibition of the kinase LRRK2 
has emerged as a promising 
disease-modifying therapeutic 
target for Parkinson's disease. 
Jennings et al. report evidence 
that DNL201, a first-in-class 
central nervous system—pen- 
etrant LRRK2 kinase inhibitor, 
reduces LRRK2 activity and 
restores lysosomal function in 
cellular and animal models. In 
healthy volunteers and patients 
with Parkinson's disease, 
DNL201 inhibited LRRK2 kinase 
activity and demonstrated an 
impact on lysosomal function 
at doses that were safe and 
generally well tolerated. These 
findings provide support for 
advancing the investigation of 
LRRK2 inhibitors to late-stage
clinical studies in patients with 
Parkinson's disease (see the 
Focus by Lewis). —OMS

Sci. Transl. Med. 14, eabj2658 (2022); 

see also abq7374 (2022). 


Surveys from the Tara Ocean's research vessel, pictured here, provide insight into 
global RNA virus abundance and distribution.


Distant access to
abortion


The US Supreme Court is 
considering overturning Roe v 
Wade, a 1973 ruling that legalized 
abortion across the country. 

If overturned, the decision to 
maintain this essential health 
service will be left to individual 
states. One consequence for 
patients living in southern 
conservative states will be 
long-distance travel to abortion 
facilities in more liberal northern 
and western states. Pleasants et 
al. used the Google Ads Abortion 
Access Study to examine preg- 
nancy outcomes for individuals 
considering an abortion. At 
4-week follow-up, women who 
lived farther from an abortion 
facility (50+ miles) had signifi- 
cantly higher odds of still being 
pregnant because of the prohibitive
costs of traveling to
a clinic. A lack of local abortion
access could be far-reaching for 
patients with underlying medical 
conditions, and others will suffer 
from the economic and mental
health burdens of continuing 
unwanted pregnancies. —EEU 
JAMA Netw. Open 5, 2212065 
(2022). 


Fast heat 


Mammals, including humans, 
have two types of body fat 
named after their color: white 
fat and brown fat. Most fat in 
humans is white fat, which stores 
energy. By contrast, brown fat 
breaks down sugar and lipids 

to generate heat when we are 
exposed to cold temperatures. 
The thermogenic process trig- 
gers rapidly and happens partly 
through the activation of a gene 
called uncoupling protein 1 
(Ucp1), which uncouples protons 
moving down a mitochondrial 
gradient from ATP synthesis, 
thus allowing the energy to be 
dissipated as heat. Wang et al. 
have evidence to explain why 
brown fat can produce heat so 
quickly. A protein called DDB1 
occupies the promoters of many 
thermogenic genes, including 
Ucp1, and releases a brake on


QUANTUM COMPUTING
Learning from quantum 
experiments 


There is considerable interest in 
extending the recent success of 
quantum computers in out- 
performing their conventional 
classical counterparts (quantum 
advantage) from some model 
mathematical problems to more
meaningful tasks. Huang et al.
show how manipulating multiple
quantum states can provide
an exponential advantage over
classical processing of measurements
of single-quantum states
for certain learning tasks. These
include predicting properties of
physical systems, performing 
quantum principal component 
analysis on noisy states, and 
learning approximate models 
of physical dynamics (see the 
Perspective by Dunjko). In 
their proof-of-principle experi- 
ments using up to 40 qubits on 
a Google Sycamore quantum 
processor, the authors achieved 
almost four orders of magnitude 
of reduction in the required 
number of experiments over 
the best-known classical lower 
bounds. —YS 

Science, abn7293, this issue p. 1182; 

see also abp9885, p. 1154 




QUANTUM SIMULATION 
Solving hard graph 
problems 


Realizing quantum speedup for 
solving practical, computation- 
ally hard problems is the central 
challenge in quantum information
science. Ebadi et al. used
Rydberg atom arrays composed 
of up to 289 coupled qubits 

in two spatial dimensions to 
investigate quantum optimiza- 
tion algorithms for solving the 
maximum independent set, a 
paradigmatic nondeterministic 
polynomial time—hard combina- 
torial optimization problem (see 
the Perspective by Schleier- 
Smith). A hardware-efficient 
encoding protocol associated 
with Rydberg blockade was 


used to realize a closed-loop 
optimization method to test 
several variational algorithms 
and subsequently apply them to 
systematically explore a class of 
nonplanar graphs with program- 
mable connectivity. The results 
demonstrate the potential of 
quantum machines as a tool for 
the discovery of new promising 
algorithm classes. —ISO 

Science, abo6587, this issue p. 1209; 

see also abq3754, p. 1155 
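
For readers who have not met the maximum independent set problem, a small classical brute-force sketch may help show what is being optimized. It has nothing to do with the Rydberg-atom method reported by Ebadi et al.; the function name and the five-vertex example graph are hypothetical, and the exhaustive search is only practical for very small graphs.

```python
from itertools import combinations


def maximum_independent_set(num_vertices, edges):
    """Return one largest set of vertices in which no two are joined by an edge."""
    edge_set = {frozenset(edge) for edge in edges}
    # Try subset sizes from largest to smallest; the first valid subset is a maximum.
    for size in range(num_vertices, 0, -1):
        for subset in combinations(range(num_vertices), size):
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                return set(subset)
    return set()


# Hypothetical example: a ring of five vertices, each joined to its two neighbors.
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
best = maximum_independent_set(5, ring_edges)
print(sorted(best), len(best))
```

The number of candidate subsets doubles with every added vertex, which is why exhaustive search of this kind quickly becomes impractical and why alternative approaches are of interest.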


PHOSPHORUS CHEMISTRY 
Chiral-at-P products via 
H-bond catalysis 


Hydrogen (H)—bonding catalysis 
has recently proven useful for 
activating carbon-chlorine bonds 
to form just one of two possible 
mirror-image products. Forbes 
and Jacobsen now extend this 
approach to desymmetriza- 
tion of phosphorus(V) [P(V)] 
dichloride compounds (see 
the Perspective by Verdaguer). 
Using chiral urea catalysts, the 
authors could displace just one 
of two chlorides with an amine, 
thereby producing a versatile 
P(V) intermediate. Subsequent 
selective displacement of the 
remaining chloride and/or amine 
offers access to a wide range of 
chiral-at-P compounds, a class 
of increasing pharmaceutical 
interest. —JSY 

Science, abp8488, this issue p. 1230; 

see also abq5073, p. 1157 


PROTEIN BIOPHYSICS 
Stabilizing receptors with 
cholesterol 


Membrane-embedded G protein–coupled
receptors such as the β₂-adrenergic
receptor (β₂AR) mingle with cholesterol,
which modulates their assembly
and stability. Serdiuk et al. examined
the effects of a cholesterol
analog on the mechanical and
energetic properties of the β₂AR
at different temperatures. The
authors showed that whereas
the presence of the cholesterol
analog in liposomes had minimal
effects at 25° and 42°C, it stabilized
subsets of conformations
of the β₂AR that were important
for basal activity at 37°C. —JFF


Sci. Signal. 15, eabi7031 (2022). 


CANCER IMMUNOLOGY 


ILC2s help tumors 
Group 2 innate lymphoid cells 
(ILC2s) protect against parasitic 
infections and are implicated 
in allergies, yet their role in 
antitumor immunity remains 
unclear. Jou et al. used vari- 
ous genetic mouse models to 
study the roles of ILC2s in 
colorectal cancer (CRC). They 
found that ILC2s were linked to 
an immunosuppressive tumor 
microenvironment, where dele- 
tion of these cells led to less CRC 
tumor burden. ILC2s specifically 
responded to high interleu- 
kin-25 (IL-25) expression in CRC 
tumors and in turn induced
immunosuppressive myeloid- 
derived suppressor cells. 
Therapeutically blocking the 
IL-25 receptor on ILC2s lowered 
tumor burden and led to more 
favorable antitumor immune 
responses. —DAE 

Sci. Immunol. 7, eabn0175 (2022).


EVOLUTION


Cetacean skulls from 
land to ocean 


the transcription of these genes, 
allowing them to be expressed. 
Increased understanding of the 
mechanism is important because 
brown fat thermogenesis is 
crucial not only for maintaining 
core body temperature but also 
for energy balance. Some studies 
have explored the potential of 
brown fat in treating obesity. —DJ 
Life Metab. 10.1093/ 
lifemeta/loacO003 (2022). 


NEUROSCIENCE 
Flexible learning in the 
cerebellum 


The cerebellum is known to 
participate in the acquisi- 

tion of motor skills and motor 
learning, but it also processes 
reward-related information. 

The interaction between these 
pathways and their contribution 
to learning is not understood. 
Sendhilnathan et al. compared 
the activity of cerebellar Purkinje 
cells while monkeys were actively 
learning new visuomotor associa- 
tions and while they performed 
an already familiar visuomotor 




association task. These cells pro- 
cess two different signals when 
monkeys learn a new arbitrary 
stimulus—response association. 
One is a previously described 
reinforcement learning error 
signal, and the second, described 
here, is a signal that describes the 
state of learning. —PRS 

J. Neurosci. 42, 3847 (2022). 


VOLCANOLOGY 
Eruption forerunners 


Volcanic eruptions often have 
different sorts of precursors 
before the main event that range 
from a change in the number of 
small earthquakes to the compo- 
sition of gases from fumaroles. 
Flóvenz et al. show that pre-eruptive
uplift cycles occurred in
a nearby geothermal field before 
the March 2021 Fagradalsfjall 
eruption in Iceland. The changes 
were likely due to fluid migra- 
tion from the magma body 

and occurred up to a year and 

a half before the eruption. The 
observations help provide differ- 
ent constraints on parts of the 
complex process that can lead to 


a subaquatic lifestyle. 


a volcanic eruption. —BG 
Nat. Geosci. 15, 397 (2022). 


NONLINEAR OPTICS 
Relaxing constraints for 
phase matching 


The interactions of light in non- 
linear optical materials produce 
a number of effects, including 
lasing, frequency conversion, 
high-harmonic generation, and 
spontaneous downconversion. 
Such effects find application 

in microscopy, optical com- 
munication networks, and 
quantum optics. Underlying 
these effects is phase matching 
of the interacting light beams, 
which requires strict experi- 
mental conditions to be met, 
often leading to cumbersome 
setups. Gagnon et al. show 

that metamaterials engineered 
with a low refractive index can 
relax the constraints for phase 
matching. Their demonstration 
of direction-independent four- 
wave mixing in a nanophotonic 
structure illustrates how such 
low-index materials can be used
to miniaturize nonlinear optical
devices. —ISO
Phys. Rev. Lett. 128, 203902 (2022).


Cetaceans have populated the ocean to
become charismatic subjects of human
legend, but it has taken 50 million years for
whales and dolphins to evolve from terrestrial
mammals into an array of specialized
aquatic species. Their shift to water required
enormous physiological and physical adaptations.
Specializations such as alterations in the posterior
skull to accommodate breathing at the water’s
surface, baleen for filter feeding, and biosonar for
echolocation are hallmarks of cetacean adaptations.
Coombs et al. analyzed in three dimensions the
shapes of skulls from hundreds of living and extinct
cetacean species. Their results show that rapid
evolution during the Eocene reshaped the front of
the skull. Further diversification of skull morphology
continued as cetaceans evolved to accommodate
new diets and feeding methods, as well as echolocation.
—PJH Curr. Biol. 32, 2233 (2022).


Since the Eocene, when their ancestors re-entered
the sea, skulls of cetaceans evolved rapidly to adapt to
a subaquatic lifestyle.


CHEMISTRY 
Molecules work on 
self-control 


Oscillation of a chemical system 
between two or more states 
underlies many important processes
in biology and may be a
useful model for understanding 
how early cells formed and func- 
tioned. Howlett et al. designed a 
model supramolecular system 
that achieves a rudimentary 
metabolism: oscillatory self- 
replication of a metastable 
micelle powered by a chemical 
fuel. The micelle contains a disul- 
fide that can cross-react with 
other components, leading to 
dispersal and eventual regenera- 
tion. The addition of a dye that is 
incorporated into the transient 
micelle provides a visual readout 
of the supramolecular oscilla- 
tions. —MAF 
Nat. Chem. 10.1038/ 
s41557-022-00949-6 (2022). 
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STRUCTURE OF THE NUCLEAR PORE 


NUCLEAR PORE COMPLEX 


RESEARCH ARTICLE SUMMARY 


Structure of the cytoplasmic ring of the Xenopus
laevis nuclear pore complex


Xuechen Zhu, Gaoxingyu Huang*†, Chao Zeng†, Xiechao Zhan, Ke Liang†, Qikui Xu, Yanyu Zhao,
Pan Wang, Qifan Wang, Qiang Zhou, Qinghua Tao, Minhao Liu, Jianlin Lei,
Chuangye Yan, Yigong Shi*


INTRODUCTION: The nuclear pore complex (NPC) 
resides on the nuclear envelope (NE) and me- 
diates nucleocytoplasmic cargo transport. As 
one of the largest cellular machineries, a ver- 
tebrate NPC consists of cytoplasmic filaments, 
a cytoplasmic ring (CR), an inner ring, a nuclear 
ring, a nuclear basket, and a luminal ring. Each 
NPC has eight repeating subunits. Structure de- 
termination of NPC is a prerequisite for under- 
standing its functional mechanism. In the past 
two decades, integrative modeling, which com- 
bines x-ray structures of individual nucleoporins 
and subcomplexes with cryo-electron tomogra- 
phy reconstructions, has played a crucial role in 
advancing our knowledge about the NPC. 

The CR has been a major focus of structural 
investigation. The CR subunit of human NPC 
was reconstructed by cryo-electron tomography 
through subtomogram averaging to an overall 
resolution of ~20 Å, with local resolution up to
~15 Å. Each CR subunit comprises two Y-shaped
multicomponent complexes known as the inner 
and outer Y complexes. Eight inner and eight 
outer Y complexes assemble in a head-to-tail 
fashion to form the proximal and distal rings, 
respectively, constituting the CR scaffold. To
achieve higher resolution of the CR, we used
single-particle cryo-electron microscopy (cryo- 
EM) to image the intact NPC from the NE of 
Xenopus laevis oocytes. Reconstructions of the 
core region and the Nup358 region of the 
X. laevis CR subunit had been achieved at 
average resolutions of 5 to 8 Å, allowing identification
of secondary structural elements.


RATIONALE: Packing interactions among the 
components of the CR subunit were poorly 
defined by all previous EM maps. Additional 
components of the CR subunit are strongly 
suggested by the EM maps of 5- to 8-Å resolution
but remain to be identified. Addressing
these issues requires improved resolution
of the cryo-EM reconstruction. Therefore, we 
may need to enhance sample preparation, 
optimize image acquisition, and develop an 
effective data-processing strategy. 


RESULTS: To reduce conformational heteroge- 
neity of the sample, we spread the opened NE 
onto the grids with minimal force and used 
the chemical cross-linker glutaraldehyde to sta- 
bilize the NPC. To alleviate orientation bias of 


Figure key: Nup205 (2), Nup93 (2), Nup358 (5), outer Y complex, inner Y complex.


Cryo-EM structure of the double-layered CR of the X. laevis NPC. The X. laevis CR, containing eight 
repeating subunits, is modeled on the basis of cryo-EM reconstructions (top left panel). One CR subunit is 
shown in two different views to highlight nucleoporins of key interest (bottom left and right panels). The inner 
and outer Y complexes are colored dark and light gray, respectively. Two Nup205, two Nup93, and five 
Nup358 molecules are colored blue, red, and purple, respectively. 


Zhu et al., Science 376, 1177 (2022) 10 June 2022 


the NPC, we tilted sample grids and imaged the 
sample with higher electron dose at higher 
angles. We improved the image-processing 
protocol. With these efforts, the average resolutions
for the core and the Nup358 regions
have been improved to 3.7 and 4.7 Å, respectively.
The highest local resolution of the core
region reaches 3.3 Å. In addition, a cryo-EM
structure of the N-terminal α-helical domain of
Nup358 has been resolved at 3.0-Å resolution.
These EM maps allow the identification of five 
copies of Nup358, two copies of Nup93, two 
copies of Nup205, and two copies of Y com- 
plexes in each CR subunit. Relying on the EM 
maps and facilitated by AlphaFold prediction, 
we have generated a final model for the CR of 
the X. laevis NPC. Our model of the CR subunit 
includes 19,037 amino acids in 30 nucleoporins. 

A previously unknown C-terminal fragment 
of Nup160 was found to constitute a key part of 
the vertex, in which the short arm, long arm, 
and stem of the Y complex meet. The Nup160 
C-terminal fragment directly binds the β-propeller
proteins Seh1 and Sec13. Two Nup205 molecules,
which do not contact each other, bind
the inner and outer Y complexes through dis- 
tinct interfaces. Conformational elasticity of 
the two Nup205 molecules may underlie their 
versatility in binding to different nucleoporins 
in the proximal and distal CR rings. Two Nup93 
molecules, each comprising an N-terminal extended
helix and an ACE1 domain, bridge the Y
complexes and Nup205. Nup93 and Nup205 
together play a critical role in mediating the 
contacts between neighboring CR subunits. 
Five Nup358 molecules, each in the shape of a 
shrimp tail and named “the clamp,” hold the 
stems of both Y complexes. The innate con- 
formational elasticity allows each Nup358 clamp 
to adapt to a distinct local environment for 
optimal interactions with neighboring nucleoporins.
In each CR subunit, the α-helical
nucleoporins appear to provide the conformational
elasticity; the 12 β-propellers may
strengthen the scaffold. 


CONCLUSION: Our EM map-based model of 
the X. laevis CR subunit substantially expands
the molecular mass over the reported
composite models of the vertebrate CR subunit. In
addition to the Y complexes, five Nup358, two 
Nup205, and two Nup93 molecules constitute 
the key components of the CR. The improved 
EM maps reveal insights into the interfaces 
among the nucleoporins of the CR. 
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Structure of the cytoplasmic ring of the Xenopus 
laevis nuclear pore complex 
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The nuclear pore complex (NPC) mediates nucleocytoplasmic cargo transport. Here, we present a single- 
particle cryo-electron microscopy reconstruction of the cytoplasmic ring (CR) subunit from the Xenopus laevis 
NPC at 3.7- to 4.7-angstrom resolution. The structure of an amino-terminal domain of Nup358 has been 
resolved at 3.0 angstroms, facilitating the identification of five Nup358 molecules in each CR subunit. Our final 
model of the CR subunit included five Nup358, two Nup205, and two Nup93 molecules in addition to the two 
previously characterized Y complexes. The carboxy-terminal fragment of Nup160 served as an organizing 
center at the vertex of each Y complex. Structural analysis revealed how Nup93, Nup205, and Nup358 facilitate 
and strengthen the assembly of the CR scaffold that is primarily formed by two layers of Y complexes. 


The nuclear pore complex (NPC) resides
on the nuclear envelope (NE) and medi- 
ates nucleocytoplasmic cargo transport 
(1-3). As one of the largest cellular ma- 
chineries (4-6), a vertebrate NPC has a 
molecular mass of >100 MDa and consists of 
multiple cytoplasmic filaments (CF), a cytoplasmic 
ring (CR), an inner ring (IR), a nuclear ring (NR), 
a nuclear basket (NB), and a luminal ring (LR) 
(5-9) (Fig. 1A). Each NPC has eight repeating 
subunits known as the spokes (7). Our present 
study concerns the structure of the CR constitu- 
ents in each spoke, referred to as the CR subunits. 
For structural elucidation of the NPC, cryo- 
electron tomography used to be the main- 
stream approach. The CR subunit of the human 
NPC was reconstructed through subtomogram 
averaging to a highest resolution of ~15 Å (10).
Each CR subunit features two Y-shaped
multicomponent complexes known as the inner
and the outer Y complexes. Sixteen Y complexes,
eight inner and eight outer, in the CR assemble
in a head-to-tail fashion to build the scaffold in
the form of the proximal and distal concentric
rings, respectively (Fig. 1, A and B) (11, 12). The
tripartite Y complex consists of a short arm 
(Nup85, Nup43, and Seh1), a long arm (Nup160 
and Nup37), and a stem (Nup96, Sec13, Nup107, 
and Nup133) (11, 13) (Fig. 1, A and D). 
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To achieve higher resolution of the CR as- 
sembly, we used single-particle cryo-electron 
microscopy (cryo-EM) reconstruction to exam- 
ine the CR subunit from the Xenopus laevis 
NPC. During data processing, two relatively 
independent and rigid structural entities stood 
out, designated as the core region and the 
Nup358 region (14). The core region contains
the long arm from the inner Y complex and 
the short arms from both Y complexes, and the 
Nup358 region covers the stems of the two Y 
complexes (Fig. 1B). An overall resolution of 
5 to 8 Å was achieved after masked refinement
for these two regions, enabling assignment of
some components, such as two Nup205 and two
Nup358 molecules (14). However, reliable docking
and modeling require higher resolution.

In this study, we achieved an average resolution
of 3.7 Å for the core region and 4.7 Å
for the Nup358 region. We also solved a cryo-EM
structure of the N-terminal α-helical domain
of Nup358 at 3.0-Å resolution. Relying
on the improved resolutions and facilitated by 
AlphaFold prediction, we were able to gener- 
ate a structural model for the entire CR of the 
X. laevis NPC. In addition to the previously 
characterized Y complexes, five Nup358, two 
Nup205, and two Nup93 molecules were iden- 
tified to be constituents of the CR subunit. Our 
current structure expands the molecular mass 
by 80% compared with the previously reported 
composite models of the human CR (10, 12).


Overall structure of a CR subunit 


To reduce conformational heterogeneity, we 
spread the opened NE onto the grids with 
minimal force and used the chemical cross- 
linker glutaraldehyde to stabilize the NPC. 
Please refer to the materials and methods, 
figs. S1 to S4, and table S1 for details of cryo-EM
sample preparation, image acquisition,
data processing, and structural modeling and
refinement. We previously defined the core
and the Nup358 regions in each CR unit based
on their spatial arrangement but were unable
to reliably assign nucleoporins (Fig. 1B) (14).
The average resolutions for these two regions
now reach 3.7 and 4.7 Å, with the highest local
resolution of the core region up to 3.3 Å
(Fig. 1C, Movie 1, and fig. S5).

Structural modeling for the best-resolved 
components within the CR subunit, including 
the C-terminal fragment of Nup160 (Nup160-CTF),
Nup93 α5, Nup205, and other previously
recognized structural components of the CR,
was aided by x-ray structures (12) or AlphaFold
predictions (15). Assignment of the two Nup93
α5 helices was facilitated by several bulky side
chains that are readily discernible in the EM 
map. In addition, the model of this long helix 
predicted by AlphaFold matches the EM map 
exceptionally well (fig. S5). Assignment of the 
residues in each X. laevis nucleoporin was as- 
sisted by sequence alignment with its functional 
orthologs. The EM density in the Nup358 region 
and the periphery of the core region, including
all five Nup358 clamps and two Ancestral
Coatomer Element 1 (ACE1) domains that belong
to Nup93 (16), is less well resolved and
lacks side chain features. Modeling in these 
areas was aided by the separately determined 
atomic structure of Nup358-NTD2 (residues 222 
to 738) (figs. S6 to S9), previous analyses of ver- 
tebrate NPC (4), and AlphaFold predictions (15). 

Altogether, our final model of the CR sub- 
unit consists of 19,037 amino acids in 30 
nucleoporins (Fig. 1, A and D, Movie 1, and 
tables S2 and S3). Except for Nup358, Nup88,
Nup155, and Nup98, all protein components
in the current model have two distinct copies
in each CR subunit (14). For description clarity,
the copy closer to and away from the central 
channel will be denoted with -I (inner) and -O 
(outer), respectively. In the following sections, 
we will focus on previously uncharacterized 
structural features of the CR for illustration. 
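
The composition stated above can be checked with a short tally. The following is a minimal Python sketch, not part of the original analysis: it takes the Y-complex membership and the copy numbers for Nup358, Nup205, and Nup93 from the text, and it assumes one copy each of Nup88, Nup155, and Nup98 per subunit, which the text implies but does not state. The function names are hypothetical.

```python
# Per-subunit composition of the cytoplasmic ring (CR), as described in the text.
# ASSUMPTION: Nup88, Nup155, and Nup98 are present as one copy each per subunit;
# the text only says they are exceptions to the "two distinct copies" rule.

Y_COMPLEX = {
    "short arm": ["Nup85", "Nup43", "Seh1"],
    "long arm": ["Nup160", "Nup37"],
    "stem": ["Nup96", "Sec13", "Nup107", "Nup133"],
}
Y_COMPLEXES_PER_SUBUNIT = 2  # one inner and one outer Y complex
SUBUNITS_PER_CR = 8          # eight repeating subunits (spokes)


def copies_per_subunit():
    """Return a dict mapping nucleoporin name to copy number in one CR subunit."""
    counts = {}
    for members in Y_COMPLEX.values():
        for name in members:
            counts[name] = Y_COMPLEXES_PER_SUBUNIT
    counts.update({"Nup358": 5, "Nup205": 2, "Nup93": 2})  # stated in the text
    counts.update({"Nup88": 1, "Nup155": 1, "Nup98": 1})   # assumed single copies
    return counts


def copies_per_ring(counts):
    """Scale per-subunit copy numbers to the full eightfold-symmetric ring."""
    return {name: n * SUBUNITS_PER_CR for name, n in counts.items()}


if __name__ == "__main__":
    per_subunit = copies_per_subunit()
    per_ring = copies_per_ring(per_subunit)
    print("nucleoporins per CR subunit:", sum(per_subunit.values()))
    print("Nup358 copies per CR:", per_ring["Nup358"])
```

Under these assumptions the tally gives 30 nucleoporins per subunit, matching the number reported for the model, and 40 copies of Nup358 in the whole ring.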


Nup160-CTF at the vertex of the Y complex 


The Y complexes have been the most extensively 
characterized, with most of their constituents 
known. On the basis of the improved EM map 
of the core region, the previously unknown 
Nup160-CTF, comprising helices α38 to α47,
was found to sit at the center of the vertex,
where the short arm, long arm, and stem of
the Y complex meet (11, 17) (Fig. 2, A and B).
This observation helps to define the organi- 
zation of the vertex in vertebrates. 

Seh1-I from the short arm and Sec13-I from 
the stem are connected to the CTF of Nup160-I 
(Fig. 2, A and C). For the inner Y complex, the 
4CD loop (the loop between strands C and D of 
blade 4) of Seh1 contacts helix α40 of Nup160-CTF,
whereas the β-3D strand (the D strand of
blade 3) of Seh1 interacts with helix α42 of
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Fig. 1. Single-particle cryo-EM analysis of the CR subunit of the X. laevis 
NPC. (A) Schematic overview of the architecture of a vertebrate NPC. Shown 
is a cut-open side view of a cartoon depicting an NPC embedded in the NE. 
Each NPC, along its cytonuclear axis, comprises the CF, CR, IR, NR and NB. A 
LR localized inside the lumen of the NE encompasses the IR. The CR 
components from vertebrates and fungi are listed in the table below. The 
copy number of each CR component in a vertebrate NPC is indicated in 
brackets. (B) EM map of the CR at 22.2-Å resolution. The core region and the
Nup358 region in one CR subunit, which are separately calculated as single
particles, are indicated by black and magenta contours, respectively, in
the left panel. The densities corresponding to the inner and outer Y
complexes in one CR subunit are colored pink and blue, respectively.
(C) Heatmaps for the local resolutions of the core region and the Nup358
region that were separately reconstructed to average resolutions of 3.7 and
4.7 Å, respectively. The local resolutions were calculated in RELION3.0
(59) and presented in ChimeraX (58). (D) Overall structure of a CR subunit
from the X. laevis NPC. The structure is presented in approximately the
same view as the two Y complexes in the right panel of (B). Protein components of
the outer Y complex and Nup205, Nup93, and Nup358 molecules are colored
the same as those annotated in (A). The inner Y complex is colored dark
gray. Nup155, which connects the CR to the IR, is labeled but not included in the
table in (A).


Nup160-CTF (Fig. 2C). The N-terminal α1 helices
from Nup96 and Sec13 pair up to wedge into
the crevice between helices α46 and α47 of
inner Nup160 (Fig. 2C). These interactions
may stabilize the Y complex. 

A conserved surface loop between α36 and
α37 of Nup160-I, referred to as the 96-loop,
closely associates with the ridge of the C-terminal
helices from Nup96-I (Fig. 2D). Similar to that in
fungi (18, 19), two C-terminal helices of Nup85-I
contact the lateral side of the middle portion of
the α-helical domain of Nup160-I (Fig. 2D). The
interactions described here for the inner Y complex
are also recapitulated in the outer Y complex. 


Two Nup205 molecules bind the inner and 
outer Y complexes through distinct interfaces 


Our structure identifies two Nup205 molecules,
defined as Nup205-I and Nup205-O, in each CR
subunit (Fig. 3A and fig. S5). The two Nup205 molecules,
which do not contact each other, associate with
the inner and outer Y complexes through different
interfaces (Fig. 3A).

Nup205-O directly contacts the short arm
and the vertex of the outer Y complex (Fig. 3B). On
one side, the tower helix (12) and six additional
helices associate with the ridge of Nup85-O in
the short arm. On the other side, five helices
contact Nup160-CTF-O at the vertex (Fig. 3B).
Nup205-O also associates with both arms of
the inner Y complex through interactions with
the bottom faces of the β-propellers in Nup43-I
and Nup37-I (Fig. 3C). Through these inter- 
actions, Nup205-O connects the two Y com- 
plexes within the same CR subunit. 

The interactions of Nup205-I with the middle
helices of Nup85-I and Nup160-CTF-I are
generally conserved relative to their outer counterparts
(Fig. 3, D to F). However, there is no contact between
the C-terminal helices of Nup205-I and
Nup85-I, which are separated by the N-terminal
propeller of Nup88 and Nup98/X from the
Nup214 complex (Fig. 3E) (14).

These structural observations uncover pre- 
viously unrecognized interfaces in the organi- 
zation of the CR scaffold and identify Nup205 
as a new core component of the CR. Distinct 
conformations of the two Nup205 molecules 
demonstrate the structural elasticity that un- 
derlies their versatility for binding to different 
nucleoporins in the proximal and distal CR 
rings and the cytoplasmic filaments. 


Nup93 bridges the Y complexes and Nup205 


Nup93 consists of an N-terminal extended
α-helix (α5) and an ACE1 domain that are
connected by flexible linkers (fig. S10) (20).
Our EM maps unveil two Nup93-ACE1 domains. The
predicted structure of helix α5 of Nup93 fits
exceptionally well into our EM density that is
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Movie 1. Single-particle cryo-EM analysis of the CR of the X. laevis NPC. The 70-s movie shows the 
three-dimensional EM reconstructions of the CR (1 to 52 s) and the surface representation of the CR model 
(53 to 70 s). The movie starts with the EM density, contoured at 3.9 σ, of the CR at ~20-Å resolution.
Then three-dimensional EM densities of the separately refined core region (average resolution of 3.7 Å,
contoured at 5.9 σ) and Nup358 region (average resolution of 4.7 Å, contoured at 3.3 σ) are illustrated at
multiple angles. Nucleoporins that are discussed in this paper are color coded. On the basis of these EM 
maps, a structural model of the CR subunit has been generated. 



Fig. 2. Nup160-CTF is an organizing center for the vertex of the inner Y complex in the CR. (A) Nup160-CTF
interacts with several components at the vertex of the inner Y complex. In each Y complex, Nup85,
Nup43, and Seh1 constitute the short arm; Nup160 and Nup37 the long arm; and Nup96, Sec13, Nup107, and
Nup133 the stem. Shown here is a surface representation for the partial inner and outer Y complexes in
the CR. Protein components in the inner and outer Y complexes are denoted with -I and -O, respectively.
Components from the inner Y complex are color coded, and the outer Y complex is colored silver. Nup107
and Nup133 are not shown for visual clarity. (B) Structure of full-length Nup160, which consists of a seven-bladed
β-propeller followed by an extended α-helical domain. Inset: Helices α38 to α47 of Nup160-CTF are well
defined in the EM map. The EM density, shown as a semitransparent surface, is contoured at 6 σ for clarity.
(C) Nup160-I-CTF sandwiched by Seh1-I and Sec13-I. (D) The central segments of the Nup160-I α-helical domain
are pinched by Nup85-I and Nup96-I. The 96-loop from Nup160, colored magenta, spreads on the surface of
Nup96. Except for (B), in which the EM map is shown as a semitransparent surface prepared in ChimeraX, all
panels show structures that are presented as cartoon or surface and prepared in PyMOL.


featured with well-resolved side chains of a 
large number of bulky residues (Fig. 4A, figs. 
S5 and S10). 

Nup93-ACE1-O resides at the stem regions 
of both Y complexes (Fig. 4, A and B). The 
interfaces between Nup93-ACE1-O and sur- 
rounding nucleoporins are discernible in fig. 
S11. The middle portion, known as the trunk, 
of Nup93-ACE1-O associates with the trunks 
of Nup107-O (fig. S11A) and Nup96-I (fig. S11B).
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The Nup93-Nup96 association appears to be 
strengthened by Sec13-I, which co-folds with 
Nup96 and uses its blades to contact the crown 
of Nup93 (Fig. 4B and fig. S11B). Through these 
interfaces, Nup93-ACE1-O connects the two Y 
complexes within one CR subunit. 
Nup93-ACE1-I connects Nup205-O with
Nup107-I from the neighboring subunit (Fig. 4,
A and C). On one end, the trunk of Nup93-ACE1-I
associates with the N-terminal domain
(NTD) of Nup205-O from the same subunit. On
the other end, the tail of Nup93-ACE1-I contacts
the lateral side of the tail in Nup107-I from the
adjacent subunit (Fig. 4C). Therefore, Nup93-ACE1-I
sews together two adjacent CR subunits.

Previous biochemical characterization sug- 
gested interactions between an N-terminal 
region of the fungal ortholog of Nup93 and 
the C-terminal domain (CTD) of Nup205 (21).
Indeed, helix α5 of Nup93-I may insert into
the axial groove of the CTD of Nup205-I, and
helix α5 of Nup93-O in one CR subunit (S1) is
likely to bind the CTD of Nup205-O in the
neighboring subunit (S2) in the same manner
(Fig. 4D). The association between Nup93-α5
and Nup205-CTD is almost identical to that 
observed in the IR and NR subunits (fig. S12) 
(22-25). 


Five Nup358 clamps in each CR subunit 


Five densities, each in the shape of a shrimp tail,
wrap around the stems of the two Y complexes
(Fig. 5A). They likely belong to five copies of
Nup358, a notion corroborated by our 3.0-Å
cryo-EM structure of the middle α-helical region
of Nup358 (residues 222 to 738, designated
NTD2) (Fig. 5B, figs. S8 and S9). Please
refer to the materials and methods for details 
of the assignment of the five Nup358 mole- 
cules, which will be hereafter referred to as 
clamps 1 to 5. 

Clamp 1 through clamp 4 do not interact 
with each other. They constitute two pairs, 
clamps 1 and 3 and clamps 2 and 4, which hold 
the stems of the outer and inner Y complexes, 
respectively (Fig. 5, C and D, and fig. S13). 
Clamp 5 connects the two pairs by binding to 
clamps 1 and 4 (Fig. 5, C and D, and fig. S13A). 
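
The clamp arrangement described above can be written down as a small contact graph. This is an illustrative Python sketch only: the contact list encodes just the clamp-to-clamp contacts named in the text (clamp 5 with clamps 1 and 4), and the helper names `build_graph` and `connected` are hypothetical. It shows that, in this description, clamp 5 is the only link between the clamp pair on the outer stem and the pair on the inner stem.

```python
from collections import deque

# Which Y-complex stem each paired clamp holds, as described in the text.
STEM_HELD = {1: "outer", 3: "outer", 2: "inner", 4: "inner"}

# Clamp-to-clamp contacts: clamps 1 to 4 do not touch one another;
# clamp 5 binds clamps 1 and 4.
CLAMP_CONTACTS = [(5, 1), (5, 4)]


def build_graph(contacts, skip=None):
    """Undirected adjacency map over clamps 1 to 5, optionally omitting one clamp."""
    graph = {c: set() for c in range(1, 6) if c != skip}
    for a, b in contacts:
        if skip in (a, b):
            continue
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected(graph, start, goal):
    """Breadth-first search for a chain of clamp-to-clamp contacts."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False


if __name__ == "__main__":
    full = build_graph(CLAMP_CONTACTS)
    without_5 = build_graph(CLAMP_CONTACTS, skip=5)
    print("clamp 1 linked to clamp 4:", connected(full, 1, 4))
    print("still linked without clamp 5:", connected(without_5, 1, 4))
```

The graph covers clamp-to-clamp contacts only; contacts between the clamps and the Y complexes or Nup93 are left out of this sketch.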

The conformational elasticity of Nup358 al- 
lows each of these five clamps to adapt to a 
distinct local environment for optimal interaction
with neighboring nucleoporins (figs. S13,
B to E, and S14). Aside from the association with
the Y complexes (figs. S13, C and D, and S14A),
clamps 2 through 5 make distinct contacts with
Nup93-ACE1-O (Fig. 5, C and D, and fig. S14, B
to E). Please refer to the materials and methods
for details of the interactions. Clamp 5,
the N terminus of which contacts clamp 1 and
Nup107-O (fig. S13E), places its C terminus
near the central domain of Nup93 (Fig. 5, C 
and D). Together, these Nup358 molecules 
encage Nup93-ACE1-O within a cleft between 
the stems of the two Y complexes (Fig. 5, C 
and D). 

The interaction between Nup358- and ACE1- 
containing proteins mediates the cross-talk 
between the Y complexes (fig. S14A). Previous 
studies identified five candidate interfaces 
between Y complexes in the CR of vertebrate 
NPC (10-12, 18). Our current study uncovers
additional interfaces that are clustered in two 
regions (fig. S14A and table S5). The first 
region involves Nup96-I and Nup358 clamp 1,
Fig. 3. Nup205 wraps around the Y complexes. (A) Nup205-O associates with both the inner and
outer Y complexes, whereas Nup205-I only interacts with the inner Y complex. Nup205-O interacts
extensively with the short arm and vertex of the outer Y complex, as well as both arms of the inner Y
complex. By contrast, Nup205-I only interacts with the short arm and the vertex of the inner Y complex.
(B) Extensive binding interface between Nup205-O-CTD and Nup85-O. (C) Association between Nup205-O
and the inner Y complex. (D) Structural variations of Nup205. When superimposed relative to Nup205-associated
Nup160 and Nup85, the CTDs of the two Nup205 molecules exhibit pronounced conformational
variations. The inner nucleoporins are color coded, and the outer counterparts are colored gray.
(E) Nup88 and Nup98/X wedge into the space between Nup205-I-CTD and Nup85-I, disrupting their
direct association. (F) Nup205-I binds to Sec13-I and Nup96-I from the vertex. The structures, shown as
cartoon or surface, were prepared in PyMOL.
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where the tail of Nup96-I contacts the crown 
of Nup107-O and stacks against the N-terminal
α-solenoid of clamp 1 (fig. S14A, middle and
right panels). The second one involves Nup205 
and the ACE1 of Nup93 molecules, the inter- 
faces of which are described below. 


Interfaces between CR subunits 


The 3.7-Å EM map of the core region was
individually aligned to each of the eight subunits
in the 22.2-Å reconstruction, generating
a composite model of the CR consisting of the
proximal and distal rings (Fig. 6A) (11, 12, 18).
For simplicity, we will refer to two adjacent
subunits within the CR as subunit 1 (S1) and
subunit 2 (S2) (Fig. 6). Nup205 and Nup93 
play critical roles in sewing the neighboring 
subunits (fig. S15). 

The NTD of Nup205-O (S2), together with
Nup37-O (S2) and Nup160-O (S2), clenches the
α-helical domain of Nup133-I (S1) (fig. S15A).
The CTD of Nup205-O (S2) associates with the
trunk of Nup107-I (S1). The CTD of Nup205-O
(S2) harbors the docking site for the outer
copy of Nup93-α5, which is likely linked to
Nup93-ACE1-O (S1) (fig. S15B). Nup93-ACE1-I
(S2) also connects two adjacent subunits. Its
tail contacts the lateral side of the tail from
Nup107-I (S1) on one side, and its trunk
interacts with the NTD of Nup205-O (S2) on
the other (fig. S15B).

These structural discoveries establish Nup205 
and Nup93 as crucial structural constituents in 
ring formation. 
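
To make the ring bookkeeping concrete, the inter-subunit contacts listed above can be enumerated around the eight subunits. This is a minimal Python sketch under stated assumptions: the contact labels paraphrase the text, subunit indices 0 to 7 and the choice of which neighbor plays S1 are arbitrary, and the Nup93-α5 contact is included even though the text describes its linkage to Nup93-ACE1-O only as likely.

```python
N_SUBUNITS = 8  # eight repeating CR subunits

# (component in S2, component in S1) pairs paraphrased from the text.
# The third entry rests on a linkage that the text calls "likely".
CROSS_SUBUNIT_CONTACTS = [
    ("Nup205-O NTD", "Nup133-I alpha-helical domain"),
    ("Nup205-O CTD", "Nup107-I trunk"),
    ("Nup205-O CTD", "Nup93-O alpha5 helix"),
    ("Nup93-ACE1-I tail", "Nup107-I tail"),
]


def ring_contacts(n_subunits=N_SUBUNITS):
    """List (s2_index, s2_component, s1_index, s1_component) around a closed ring."""
    contacts = []
    for s2 in range(n_subunits):
        s1 = (s2 + 1) % n_subunits  # ASSUMPTION: arbitrary direction around the ring
        for comp_s2, comp_s1 in CROSS_SUBUNIT_CONTACTS:
            contacts.append((s2, comp_s2, s1, comp_s1))
    return contacts


if __name__ == "__main__":
    all_contacts = ring_contacts()
    print("inter-subunit contacts in one CR:", len(all_contacts))
    for contact in all_contacts[:4]:
        print(contact)
```

Because the ring is closed, every subunit acts once as S2 and once as S1, so each of the eight interfaces carries the same set of contacts in this simplified picture.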




Fig. 4. Nup93 bridges Y complexes and Nup205 molecules. (A) Nup93
molecules bridge adjacent Y complexes as well as Nup205 molecules.
Identical components from two adjacent CR subunits, designated S1 and S2,
are presented. Nup205 and the Y complexes are shown in surface
representation. Inner and outer components are colored dark and light gray,
respectively. Two copies of Nup93 are present in each CR subunit, with
the one closer to the central pore designated as Nup93-I (hot pink) and the
other as Nup93-O (red). For visual clarity, only Nup93-O from S1 and
Nup93-I from S2 are shown. (B) Nup93-ACE1-O connects the stems of the
inner and outer Y complexes in each CR subunit. The bridging role is achieved
through direct binding to the ACE1 domains in Nup96-I and Nup107-O.
These proteins form a triangular ACE1 core at the stem of the Y complexes,
which is bolstered by Sec13-I and Nup133-O. Previously defined module
names of the ACE1 domain, crown, trunk, and tail are labeled. (C) Nup93-ACE1-I
bridges two adjacent subunits through direct binding to the ACE1 of
Nup107-I from S1 and the NTD of Nup205-O from S2. (D) The α5 helix
from Nup93 inserts into the axial groove of Nup205-CTD. The structures,
shown as cartoon or surface, were prepared in ChimeraX.
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Fig. 5. Five Nup358 clamps connect ACE1 proteins around the stems of 
the Y complexes. (A) EM densities for the five Nup358 clamps within one CR 
subunit. The EM densities corresponding to the five Nup358 clamps were 
individually carved out of the refined local map for the Nup358 region using the 
map segmentation tool in Chimera. The EM maps for the five Nup358 clamps 
were then individually contoured and colored. The contour level for each clamp is 
indicated in brackets. In this panel, Nup93 and Nup107 are colored gray. 

(B) Cryo-EM structure of Nup358-NTD2 at 3.0-A resolution. Left: Resolution 
map of Nup358-NTD2. The local resolutions might be slightly inflated due to a 
variety of factors. Please refer to fig. S7C for representative densities. Right: 
Domain structure of Nup358. UR, unstructured region; FR, functional region. The 
terminal residues are labeled. Four surface motifs of Nup358-NTD2 (N-hook, 
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Fig. 6. Structure of the double-layered CR scaffold. Shown is a composite model of the X. laevis CR. The 
model is shown as surface in ChimeraX. In addition to the two concentric Y-complex rings that each 
comprises eight head-to-tail assembled Y complexes, our present study identifies Nup358, Nup205, and 
Nup93 to be constituents of the CR scaffold. The inner and outer Y complexes in the subunit closest to 
reader are colored sandy brown and wheat, respectively, and in the other seven subunits they are colored 


dark and light gray, respectively. 


Discussion 


N-hook A) se 
28 re V4 Clamp-4 
' Clamp-5 wy. 
Clip 
2.9 (A) helices Inner 
~ YY , Clamp-2 Complex 
3 


Subunit 4 ($1) 


loop-1, clip helices, loop-2), which mediate the cross-talk between Nup358-NTD2 
and ACE1 proteins within the CR, are color coded. (©) Nup93-ACE1-O and 
Nup358 clamps bridge the two Y complexes. Nup93-ACE1-O binds the stems of 
both Y complexes (top panel). The five clamps encage the Nup93-ACE1-0 
(bottom panel). Clamps-4 and -5 directly interact with each other to bridge the 
inner and outer Y complexes. (D) The clamp-1/3 and clamp-2/4 pairs hold 

the stems of the outer and inner Y complexes, respectively. The overall CR 
subunit is shown in two opposite views to highlight the relative positions of the 
five Nup358 clamps and two Nup93 molecules. The inner and outer Y complexes 
are shown as light- and dark-gray surfaces, respectively. The Nup358 clamps 
and Nup93 are color coded and highlighted as cylindrical cartoons. Structures in 
(C) and (D), shown as cartoon or surface, were prepared in ChimeraXx. 


A wealth of structural information at atomic 
resolution on the NPC has emerged from 
crystal structures of individual nucleoporins 
or subcomplexes in the past two decades 
(5, 7, 26). Structural studies of the assembled 
NPC at moderate resolutions have been 
complemented by other biochemical strategies 
exemplified by chemical cross-linking and 
mass spectrometry (27-29). Collectively, these 
methods have yielded critical insights into the 
overall organization of the NPC (10-12, 30, 31). 
However, it is difficult to obtain accurate 
distance information, to determine precise 
stoichiometry, and to pinpoint an interface using 
these indirect methods. The improved EM map 
in our present study reveals a number of 
previously unrecognized features (fig. S5). 

The improved model of the CR subunit 
provides a framework for mechanistic 
understanding of the NPC assembly. The NPC is subject to 
assembly and disassembly during a cell cycle. 
The Y complex has been shown to stay as an 
intact subcomplex throughout the open mitosis 
of metazoans (32). Our structural finding that 
the vertebrate-specific Nup160-CTF is at the 
center of the multiprotein vertex of the Y 
complex provides a mechanistic explanation 
for the observed multiple conformations of 
the short arm in the Y complexes lacking 
Nup160-CTF (19, 33) (fig. S16). Supporting the 
functional significance of Nup160-CTF, it was 
recently reported that residues 1173 to 1436 
(corresponding to segments α37 to α47 in the 
structure) of Nup160 were missing in a patient 
with steroid-resistant nephrotic syndrome (34). 

Our EM maps identify two Nup93 molecules 
within each CR subunit, expanding the list of 
ACE1-containing nucleoporins (ACE1 proteins) 
within the CR. In the updated model of the 
CR subunit, four of seven α-helical domain-
containing nucleoporins have the ACE1 fold 
(20), which exhibits a similar conformation in 
both inner and outer copies (fig. S17). Cross-
talk among the ACE1 domains of Nup93-O, 
Nup107-O, and Nup96-I, which form an ACE1 
core at the stem region of the Y complexes, is 
mediated through interfaces that differ from 
the canonical crown-crown association (35) 
(Fig. 4B). Our recent analysis of the IR subunit 
also revealed a homodimerization interface of 
Nup93 through its crown and tail (22). These 
observations corroborate a key role of the 
multifaceted ACE1 proteins within the NPC 
assembly (20). Published EM maps (23, 36, 37) 
indicate that the Nup93-ACE1-O is also pre- 
sent in the NR subunit from a vertebrate NPC. 
Together with four molecules of Nup93 in the 
IR subunit (12, 22, 25, 30), there are at least 
seven Nup93 molecules within each NPC spoke. 

Both Nup205 and Nup188 adopt a super- 
helical fold that is structurally reminiscent of 
the karyopherin family (24, 38). Our EM map 
of the core region reveals that two Nup205 
molecules, rather than Nup188 molecules, are 
present within each CR subunit. The IR sub- 
unit of the X. laevis NPC also comprises two 
Nup205 molecules (22), whereas only one 
Nup205-O molecule is found in the EM maps 
of the NR subunit from X. laevis (23, 36) and 
Homo sapiens (37). These data suggest that at 
least five Nup205 molecules are present within 
each NPC spoke. The potentially differential 
placement of Nup205 in the CR and NR 
marks one point of structural asymmetry 
between these two rings. 

Among the eight α-helical nucleoporins in 
the CR subunit, Nup85, Nup96, Nup93, Nup107, 
and Nup205 each contain a single helical 
solenoid that confers their structural elasticity 
(fig. S17). Nup133 and Nup160 each contain an 
elongated α-helical domain. Unlike the others, 
Nup358 contains multiple tetratricopeptide repeat 
(TPR)-like repeats (39) that are clustered 
into two distinct superhelical solenoids. The 
unique shape and structural features of Nup358 
further support our assignment of Nup358 to 
the five clamp-shaped EM densities. Relative 
movements of the two solenoids confer ad- 
ditional conformational elasticity on Nup358 
(fig. S13B). Identification of the fifth copy of 
Nup358 expands our understanding of the 
role of Nup358 and highlights its complex 
modality in the assembly of CR. Both Nup358 
clamp 1 and Nup93-ACE1-O engage in direct 
interfaces that connect the inner and outer 
Y complexes (Figs. 4 and 5 and figs. S11 and 
S14). Such interweaving may strengthen the 
association between the two concentric rings 
of the vertebrate CR (10). In total, our current 
model accounts for <30% of the full-length 
Nup358. The invisible functional domains of 
Nup358 may project into the cytosol for events 
such as Ran binding (7). Because of residual 
anisotropy and structural flexibility, the cur- 
rent EM density falls short of sequence assign- 
ment for the peripheral regions of the CR 
subunit, including five Nup358 clamps and 
the two Nup93 ACE1 domains. Future improve- 
ment for these regions of the CR subunit may 
confer a more conclusive assignment. 

In summary, our final model reported in 
this study, with >200 nucleoporins assigned, is 
estimated to account for ~90% of the molec- 
ular mass for the ordered region of a verte- 
brate CR. Assembly of the CR scaffold relies on 
the α-helical nucleoporins; the CR scaffold is 
structurally solidified by 12 β-propellers in 
each CR subunit (fig. S18). 


Materials and Methods 
Cryo-EM sample preparation 


The NE from X. laevis oocytes was prepared 
largely as previously described (9, 14) but with 
the following improvements. First, to reduce 
conformational heterogeneity of the NPC caused 
by mechanical distortion, the opened NE was 
spread onto the grids as gently as possible, 
with minimal force applied. This practice is 
in contrast to methods used in previous studies 
(9, 14), in which mechanical force was applied 
to ensure flat placement of the NE on the grids. 
Second, to stabilize the NPC conformation, the 
NE on the grids was cross-linked for 30 min on 
ice using 0.5% glutaraldehyde in a low-salt 
buffer (10 mM Hepes, pH 7.5, 1 mM KCl, and 
0.5 mM MgCl2). Third, copper grids were replaced 
by gold ones in this study. Finally, to 
improve sample uniformity, each batch of sam- 
ples was prepared on the same day using 
oocytes from the same frog. The gold EM grids 
(R1.2/1.3, R2/1, and R2/2; Quantifoil, Jena, 
Germany) were blotted for 8 s with a blot force 
of 15 and vitrified by plunge-freezing into liquid 
ethane using a Vitrobot Mark IV (Thermo 
Fisher Scientific) at 8°C under 100% humidity. 
The quality of the sample was examined using 
an FEI Glacios microscope (Thermo Fisher 
Scientific) operating at 200 kV. 


Data acquisition of intact X. laevis NPC 


Grids were transferred to a Titan Krios electron 
microscope (FEI) operating at 300 kV and 
equipped with a Gatan GIF Quantum energy 
filter (slit width 20 eV). A total of 46,143 
micrographs were recorded with the grids tilted 
at angles of 0°, 30°, 45°, and 55° (14, 40). A 
K3 detector (Gatan) was used in the super-resolution 
mode with a nominal magnification 
of 64,000×, resulting in a calibrated pixel 
size of 0.6935 Å for the movie files (fig. S1 and 
table S1). The movie images were then binned 
twofold during motion correction, arriving at 
a pixel size of 1.387 Å for the final motion-
corrected images. The total dose followed a 
cosine alpha scheme in which the dose is 
inversely proportional to the cosine value of 
the tilting angle. Within each stack, the exposure 
time for each frame and the dose rate 
were kept the same. Detailed statistics of data 
collection are reported in table S1. All frames 
in each stack were first aligned and summed 
using MotionCor2 (41). Dose weighting was 
also performed using MotionCor2 (41). The average 
defocus values were set between -1.5 and 
-3.0 µm and estimated using Gctf (42). 
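
The binning and dose relationships stated above reduce to simple arithmetic. The following is a minimal sketch for checking them; the function names and the zero-tilt dose of 1.0 (a relative unit) are illustrative assumptions, not values or tools from this study. 

```python
import math

SUPER_RES_PIXEL_A = 0.6935  # calibrated super-resolution pixel size in Å (from the text)


def binned_pixel_size(pixel_a: float, binning: int) -> float:
    """Pixel size after binning by an integer factor."""
    return pixel_a * binning


def relative_dose(tilt_deg: float, dose_at_zero: float = 1.0) -> float:
    """Dose inversely proportional to cos(tilt), as described in the text.

    dose_at_zero is a hypothetical relative unit, not a measured value.
    """
    return dose_at_zero / math.cos(math.radians(tilt_deg))


if __name__ == "__main__":
    print(f"2x binned pixel size: {binned_pixel_size(SUPER_RES_PIXEL_A, 2):.3f} Å")
    for tilt in (0, 30, 45, 55):
        print(f"tilt {tilt:2d}°: relative dose {relative_dose(tilt):.3f}")
```

The same helper reproduces the coarser pixel sizes quoted elsewhere in the Methods: binning factors of 4 and 16 applied to 0.6935 Å give 2.774 Å and 11.096 Å, respectively. 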


Initial model of the CR 


We examined all 46,143 micrographs and man- 
ually selected 33,747 for further processing. 
A total of 800,825 particles were manually 
selected from these micrographs (fig. S2A). 
Initial defocus estimation was performed as 
previously described (14) before all other 
data-processing procedures. 

We used the 18-Å map of the CR from our 
previous study (14) as the initial reference, which 
was resampled to a pixel spacing of 11.096 Å 
and low-pass filtered to 40 Å to reduce potential 
model bias. We then performed one run 
of a global search (K = 1) three-dimensional 
classification, with the solvent mask covering 
only the CR. The search had 40 iterations. The 
resulting star files from the last several itera- 
tions were then subjected to local search three- 
dimensional classifications using multiple 
reference seeds for guidance. All good classes 
were then merged and duplicated particles were 
removed, generating a dataset of 660,302 NPC 
particles. A final round of autorefinement with 
a solvent mask on the CR resulted in a reconstruction 
at 22-Å resolution (fig. S2A). The 
resulting reconstruction was nearly identical 
to that in our previous study (14) (fig. S2B). C8 
symmetry was applied throughout this stage 
of data processing. 
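
The step of merging good classes and removing duplicated particles can be sketched as follows. This is only an illustration under stated assumptions: plain dictionaries stand in for STAR-file rows, the field names `micrograph`, `x`, and `y` are hypothetical, and a duplicate is taken to mean the same micrograph and the same coordinates within a tolerance. 

```python
from typing import Dict, Iterable, List

# Hypothetical particle record: {"micrograph": str, "x": float, "y": float, ...}
Particle = Dict[str, object]


def merge_unique(classes: Iterable[List[Particle]], tol_px: float = 1.0) -> List[Particle]:
    """Merge particle lists, keeping one particle per micrograph and coordinate bin."""
    seen = set()
    merged: List[Particle] = []
    for particles in classes:
        for p in particles:
            # Coordinates are binned to tol_px so that repeated picks collapse to one key.
            key = (p["micrograph"], round(p["x"] / tol_px), round(p["y"] / tol_px))
            if key not in seen:
                seen.add(key)
                merged.append(p)
    return merged
```

Binning coordinates is a simplification: two picks that straddle a bin boundary would be kept as separate particles, so a distance-based comparison may be preferable in practice. 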


Data processing and reconstruction of the 
CR subunit 


We extracted the CR subunit particles on the 
basis of the alignment parameters of the 22-Å 
CR reconstruction. We updated the orientation, 
shift, and defocus parameters for each 
subunit according to a published protocol (14). 
In short, a cropping center for each subunit 
was defined in the map, and the Euler angles 
along with two-dimensional shifts for each sub- 
unit particle were deduced from the CR particle. 
The recentering procedure was repeated eight 
times, defining a three-dimensional geometric 
relationship among eight subunits of the same 
CR. A total of 5,148,474 particles of the CR subunit 
were extracted using a box size of 256 and a 
binned pixel size of 2.774 Å (fig. S3A). We then 
performed three rounds of CTF refinement with 
geometrical restraint among subunits of the 
same CR and five rounds of guided multireference 
three-dimensional classification. This 
practice allowed the selection of 2,477,433 CR 
subunits, which yielded a reconstruction of the 
core region at 5.77-Å resolution. To fully use the 
dataset, these subunits were projected back 
to the original CR particles. The defocus values 
of all subunit particles within the same CR 
were pooled together to calculate a corrected 
average defocus value for the center of mass 
of each CR particle. All subunits of this CR were 
then reextracted using the updated defocus val- 
ue deduced from the corrected average defocus 
of this CR particle. Through this procedure, 
4,528,642 subunit particles from 581,882 CR 
particles were extracted, resulting in a reconstruction 
at 5.6 Å after autorefinement (fig. S3A). 
This dataset was then used for reextraction 
with a box size of 400 and pixel size of 1.387 Å 
(fig. S3B). Following our published protocol (14), 
data processing beyond this point was performed 
individually for four datasets defined 
by the tilting angles: tilt0, tilt30, tilt45, and 
tilt55. This was because of our observation 
that reconstruction at relatively high resolu- 
tion is easily biased toward low-tilt datasets 
due to their relatively high signal-to-noise ratio. 
In fact, the high-tilt and low-tilt datasets ex- 
hibited quite different motion statistics (fig. S1). 
The four datasets were then separately recon- 
structed to obtain an estimation of signal-to-noise 
ratio for their respective tilting angles. After pol- 
ishing and CTF refinements, the four datasets 
were pooled for three-dimensional autorefine- 
ments and three-dimensional classifications. 
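
The geometric idea behind subunit extraction and defocus pooling can be sketched in a few lines. This is a simplified illustration, not the published protocol: the rotation convention (a 3x3 matrix taking map coordinates to the image frame), the choice of z as the symmetry axis, the sign relating height to defocus, and all function names are assumptions made for the example. 

```python
import numpy as np


def rot_z(angle_deg: float) -> np.ndarray:
    """Rotation matrix about the z axis (taken here as the C8 symmetry axis)."""
    a = np.radians(angle_deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def subunit_poses(ring_rot, ring_xy, subunit_center, n_sym=8):
    """Derive orientation, 2D position, and height of each subunit from one ring particle.

    ring_rot        3x3 rotation from map to image frame (assumed convention)
    ring_xy         2D position of the ring centre in the micrograph (pixels)
    subunit_center  cropping centre of one subunit relative to the ring centre (pixels)
    """
    poses = []
    for k in range(n_sym):
        sym = rot_z(360.0 * k / n_sym)
        center_img = ring_rot @ sym @ subunit_center  # subunit centre in the image frame
        poses.append({
            "rot": ring_rot @ sym,                    # orientation assigned to subunit k
            "xy": ring_xy + center_img[:2],           # projected 2D position
            "height": center_img[2],                  # offset along the beam axis
        })
    return poses


def pooled_defocus(subunit_defocus, heights_px, pixel_a):
    """Pool subunit defocus values into one ring-centre value, then redistribute.

    Assumes defocus varies linearly with height along the beam; the sign is a convention.
    """
    d = np.asarray(subunit_defocus, dtype=float)
    dz = np.asarray(heights_px, dtype=float) * pixel_a  # pixels to Å
    center = float(np.mean(d - dz))                     # corrected average at the ring centre
    return center, center + dz                          # updated per-subunit defocus values
```

Averaging over the eight subunits of one ring reduces the noise of each individual defocus estimate, which is the motivation for pooling before reextraction. 
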
A second strategy to enhance the quality of 
the reconstruction is to improve the results of 
particle polishing. In particle polishing, initial 
movement tracks are obtained either through 
global frame alignment or polynomial motion 
models determined from local motion trajec- 
tories (43). The magnitude of the sample drifts 
in the tilted datasets appears much greater 
than that in the untilted ones (fig. S1). This 
problem, along with increased sample thick- 
ness at high-tilt angles, prevents the local 
motion model from accommodating sample 
deformation. An initial motion model estimation 
using MotionCor2 resulted in 18,619 
failed attempts to explain local sample defor- 
mation in 38,583 movie stacks using the 
polynomial model. In particular, sample de- 
formation was successfully explained for only 
6825 movie stacks out of a total of 22,132 for 
the tilt45 and tilt55 datasets. Errors introduced 
by the polynomial motion model undermine 
the accuracy of the initial movement tracks for 
particle polishing, thus affecting the final out- 
come. To address this issue, we took advantage 
of the large size of a single NPC particle and 
attempted to estimate only the local motion 
around each NPC particle instead of dividing 
the micrograph into rectangular patches and 
estimating motion for each patch. A major 
rationale for this practice is that empty patches, 
which occur often, may engender misalignment, 
thereby compromising the accuracy of the 
polynomial motion model. To improve align- 
ment, we used a polynomial motion model 
similar to those implemented in MotionCor2 
(41), RELION (44), and WARP (45) to regularize 
movement tracks of particles that belong to the 
same micrograph. This polynomial model was 
iteratively refined until convergence. A final 
round of patch-based alignment using only 
information around each NPC particle was 
performed to account for any residual move- 
ment of each particle. Finally, any micrograph 
that failed to converge or had residual motion 
of >15 pixels in any direction was removed. 
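
A minimal sketch of regularizing per-particle movement tracks with a polynomial motion model is given below. It is not the implementation used in this study or in MotionCor2, RELION, or WARP: the basis (terms 1, x, y multiplied by t, t^2, t^3), the single least-squares fit in place of iterative refinement, and the reading of "residual motion" as the deviation of a track from the fitted model are all assumptions for illustration. 

```python
import numpy as np


def design_matrix(xy: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Basis terms {1, x, y} x {t, t^2, t^3} for every (particle, frame) pair."""
    x, y = xy[:, 0][:, None], xy[:, 1][:, None]   # shape (N, 1)
    tt = t[None, :]                                # shape (1, T)
    spatial = [np.ones_like(x), x, y]
    temporal = [tt, tt ** 2, tt ** 3]
    cols = [(s * f).ravel() for s in spatial for f in temporal]
    return np.stack(cols, axis=1)                  # shape (N*T, 9)


def fit_motion(xy: np.ndarray, t: np.ndarray, tracks: np.ndarray):
    """Least-squares fit of a smooth motion field to noisy per-particle tracks.

    xy      (N, 2) particle coordinates, normalised to roughly [-1, 1]
    t       (T,)   frame times, normalised to [0, 1]
    tracks  (N, T, 2) measured shifts in pixels
    """
    a = design_matrix(xy, t)
    b = tracks.reshape(-1, 2)
    coeffs, *_ = np.linalg.lstsq(a, b, rcond=None)
    smooth = (a @ coeffs).reshape(tracks.shape)
    return coeffs, smooth


def keep_micrograph(tracks, smooth, max_residual_px: float = 15.0) -> bool:
    """Keep a micrograph only if no residual exceeds the threshold in any direction."""
    return bool(np.max(np.abs(tracks - smooth)) <= max_residual_px)
```

Because only positions occupied by NPC particles enter the fit, empty regions of the micrograph contribute no rows to the design matrix, which mirrors the rationale given above for avoiding rectangular patches. 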

Using the above strategy, we were able to 
rescue 13,758 micrographs from 18,619 that 
failed to provide reliable local motion models. 
We separately reconstructed the four tilt datasets 
and performed one round of CTF refinement. 
The results were directly subjected to autorefine- 
ment, resulting in a reconstruction of the core 
region at 4.5-Å resolution (fig. S3B). The combined 
dataset was then redivided into four 
subsets according to tilting angle and sub- 
jected to particle polishing and CTF refinement 
in RELION. The resulting particles were again 
pooled together and subjected to autorefine- 
ment and multireference three-dimensional 
classifications. The above cycle was repeated 
several times. In the last few cycles, magnification 
anisotropy refinement and per-micrograph as- 
tigmatism refinement were also performed to 
reduce distortion due to incorrect magnifica- 
tion and astigmatism. Additionally, the box 
size of the particles used was enlarged during 
particle polishing for the last few cycles using 
the window and scale option from 
relion_motion_refine. These measures resulted in a 
final average resolution of 3.7 Å from 1,279,270 
particles (fig. S3B and table S1). 

For the Nup358 region, the particles were 
subjected to local refinement in cryoSparc (46) 
to reach an average resolution of 4.7 Å (fig. 
S3B and table S1). The EM maps in the core 
region display distinct features for secondary 
structural elements and some of the bulky 
amino acid side chains (figs. S4 and S5). These 
features facilitated sequence assignment of 
the nucleoporins and identification of protein- 
protein interfaces. 


Expression and purification of Nup358-NTF 


The cDNA encoding the N-terminal 1300 
amino acids of X. laevis Nup358 (Uniprot: 
A0A1L8HGL2) was synthesized with codon 
optimization (Qinglan Biotech). The N-terminal 
fragment (residues 1 to 1171, referred to as 
Nup358-NTF) with an N-terminal Flag tag was 
cloned into the pCAG vector. This construct 
was verified by DNA sequencing. HEK293F cells 
(Invitrogen) were cultured in SMM 293T-II 
medium (SinoBiological) supplemented with 
5% CO2 in a Multitron-Pro shaker at 130 rpm 
at 37°C. The cells were transfected at a density 
of ~2 × 10⁶ cells/ml. SMS 293-SUPI 
(SinoBiological) was supplemented into the 
culture 24 hours after the transfection. The 
cells were then cultured for an additional 24 to 
36 hours. 

The transfected HEK293F cells were har- 
vested by centrifugation at 3800g and resus- 
pended in a lysis buffer containing 25 mM 
Tris-HCl, pH 8.0, 150 mM NaCl, and a cocktail 
of protease inhibitors (VWR). The final concentrations 
of the inhibitors were 2 mM 
for phenylmethylsulfonyl fluoride (PMSF), 
5.2 µg/ml for aprotinin, 2.8 µg/ml for pepstatin, 
and 10 µg/ml for leupeptin. The cells 
were lysed by ultrasonication (Vibra-Cell, 
SONICS). After centrifugation at 30,000g for 
1 hour, the supernatant was loaded into an 
anti-Flag M2 affinity gel (Sigma-Aldrich) column 
and eluted with the lysis buffer supplemented 
with 200 µg/ml Flag peptide. The 
eluted fraction was concentrated using a 50-kDa 
cut-off Centricon (Millipore) and applied to size- 
exclusion chromatography (Superose-6, GE 
Healthcare). The peak fractions were ana- 
lyzed by SDS-polyacrylamide gel electropho- 
resis (fig. S6). 


Cryo-EM data acquisition for Nup358-NTF 


An aliquot of 4 µl of freshly purified Nup358-NTF 
at a concentration of ~2 mg/ml was 
placed on glow-discharged holey carbon grids 
(Quantifoil Au 300 mesh, R1.2/1.3). Grids were 
blotted for 3.5 s and plunge-frozen in liquid 
ethane cooled by liquid nitrogen using Vitrobot 
Mark IV (Thermo Fisher) at 8°C under 100% 
humidity. The grids were transferred to a Titan 
Krios electron microscope (Thermo Fisher) 
operating at 300 kV and equipped with a GIF 
Quantum energy filter (Gatan). Using the 
same setup as that for intact X. laevis NPC data 
acquisition, a total of 17,147 movie stacks were 
recorded. WARP was used for all subsequent 
preprocessing, including CTF estimation and 
motion correction (45). 


Image processing for Nup358-NTF 


Out of a subset of 1176 micrographs, 627,188 
particles were first autopicked using Topaz (47), 
with its default model on micrographs prebinned 
eightfold. Particles were extracted 
using a box size of 100 and a pixel size of 2.774 Å. 
These particles were then subjected to four 
rounds of two-dimensional classifications in 
cryoSparc (46), yielding 201,480 particles. The 
good particles were then converted to RELION- 
compatible star files using the csparc2star.py 
program from the pyem suite (48). These 
particles were used to train a deep learning 
model in Topaz. Relying on this model, we 
picked 2,418,968 particles from 17,147 micrographs 
using a box size of 100 and a binned 
pixel size of 2.774 Å. These particles were 
directly subjected to two additional rounds of 
two-dimensional classifications in RELION, 
yielding a dataset containing 2,047,018 particles 
(fig. S7A). These particles were then reextracted 
using a box size of 200 and a pixel size of 1.387 Å 
and imported into cryoSparc. Eight repetitions 
of initial model generation were executed, and 
the best initial model was selected on the basis 
of visual inspection. 

We then used csparc2star.py to convert the 
213,534 particles assigned to this class into 
RELION-compatible star files and used the 
orientational parameters contained within to 
generate a RELION-compatible reconstruction. 
Using this initial model, one round of random- 
phase three-dimensional classification with the 
random_mask and solvent_mask options turned 
on was performed on the whole dataset. Similar 
to a previously published procedure (49), all 
other references except for class 1 were phase 
randomized to generate a “bad” reference with 
the user-specified phase randomization upper 
and lower limits. To further distinguish pro- 
tein from solvent, the EM density outside of 
the user-supplied random_mask was kept un- 
changed in the “bad” references; the EM den- 
sity inside the random_mask was subjected 
to the phase randomization procedure. Conversely, 
the EM density inside the random_mask 
was kept unchanged in the “good” reference 
and that outside was subjected to phase 
randomization. 
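
The masking rule for the "good" and "bad" references can be illustrated with a short NumPy sketch. It is not the RELION code used in this study: the function names, the use of a binary mask, and the way random phases are drawn (from the Fourier transform of a real noise volume, which keeps the result real) are assumptions made for the example. 

```python
import numpy as np


def randomize_phases(vol, pixel_a, res_lo_a, res_hi_a, seed=0):
    """Replace Fourier phases of a cubic volume between two resolution limits (Å)."""
    rng = np.random.default_rng(seed)
    n = vol.shape[0]
    freq = np.fft.fftfreq(n, d=pixel_a)                    # cycles per Å
    fx, fy, fz = np.meshgrid(freq, freq, freq, indexing="ij")
    radius = np.sqrt(fx ** 2 + fy ** 2 + fz ** 2)
    band = (radius >= 1.0 / res_lo_a) & (radius <= 1.0 / res_hi_a)

    ft = np.fft.fftn(vol)
    # Phases of a real noise volume are Hermitian-symmetric, so the output stays real.
    noise_phase = np.angle(np.fft.fftn(rng.standard_normal(vol.shape)))
    ft[band] = np.abs(ft[band]) * np.exp(1j * noise_phase[band])
    return np.fft.ifftn(ft).real


def make_references(vol, mask, pixel_a, res_lo_a, res_hi_a):
    """Build 'good' and 'bad' references following the masking rule in the text."""
    scrambled = randomize_phases(vol, pixel_a, res_lo_a, res_hi_a)
    good = mask * vol + (1.0 - mask) * scrambled   # inside kept, outside randomized
    bad = mask * scrambled + (1.0 - mask) * vol    # inside randomized, outside kept
    return good, bad
```

Amplitudes are left untouched, so the scrambled volume has the same power spectrum as the input and differs only in its phases within the chosen resolution band. 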

Using this approach, we were able to per- 
form image alignment for the relatively small 
Nup358-NTF protein for up to 40 iterations. 
After the global search random phase classi- 
fication procedure, data stars from the last 
several iterations from the global search run 
were subjected to local search multireference 
three-dimensional classifications. The good 
classes were merged and duplicated particles 
were removed. Finally, the remaining 744,385 
particles were imported into cryoSparc for a 
final round of local refinement to yield reconstructions 
at an average resolution of 3.0 Å for 
two separate regions that were flexible relative 
to each other (fig. S7A and table S1). The final 
EM reconstruction exhibits clear features for 
sequence assignment (fig. S7, B and C). 


Structural modeling of the CR subunit 


The 4.7-Å map of the Nup358 region contains 
five similar, clamp-shaped density patches. To 
facilitate modeling of these five clamps, we 
first generated the atomic model and assigned 
sequence register for Nup358-NTD2 (residues 
222 to 738) on the basis of the 3.0-Å EM map 
of Nup358-NTF (fig. S8). The atomic model of 
Nup358-NTD2 was refined against the 3.0-Å 
map using Phenix real-space refine (50) with 
secondary structure restraints. 

The final atomic model of Nup358-NTD2 
can be placed into each of these five density 
patches with little adjustment (fig. S9). On the 
basis of the x-ray structure of human Nup358-NTD1 
(39), we generated a homology model 
for NTD1 (residues 1 to 145) of Nup358 from 
X. laevis. Nup358-NTD1 and Nup358-NTD2 
were separately docked into each of the five 
clamp-shaped density areas. In each of these 
five areas, this practice leaves a similar den- 
sity patch unfilled between NTD1 and NTD2. 
This unfilled density patch was assigned to 
the sequences between NTD1 and NTD2 (res- 
idues 146 to 221), which were predicted to be 
α-helical. We generated a model for residues 
1 to 738 of Nup358, which was individually fit 
into each of the five clamp-shaped density 
areas. This observation suggests the presence 
of a fifth Nup358 molecule in addition to four 
speculated Nup358 molecules in the Nup358 
region (fig. S9) (29). 

Apart from the newly resolved structure 
of Nup358-NTD2, modeling for the CR subunit 
was largely based on the coordinates 
of X. laevis CR from our previous study [Protein 
Data Bank (PDB) ID: 6LK8] (14), which 
were fitted into the new reconstruction of the 
CR subunit using Chimera (51). For the core 
region, the secondary structural elements were 
manually checked and adjusted on the basis 
of the much-improved EM density map using 
Coot (52). The reconstruction of the core region 
at 3.7-Å resolution allowed us to identify 
a large number of bulky residues in components 
of both the inner and outer Y complexes 
and in Nup205. We de novo modeled 
10 α-helices in the newly assigned C-terminal 
fragment of Nup160 (Nup160-CTF), which was 
manually adjusted on the basis of the sequence 
conservation, and secondary structural elements 
were extracted from the AlphaFold model (15). 
For the less-well-resolved N-terminal regions of 
Nup160-O, modeling was performed by docking 
of the coordinates of the N-terminal regions 
of Nup160-I into the EM density of the core region 
low-pass filtered to 10 Å, in which the 
outer long arm is clearly discernible. In addition, 
the β-propeller of Nup88, the autoproteolytic 
domain of Nup98, and the CTD of 
Nup155 were modeled using the predicted 
structures. The current EM density does not 
allow for reliable docking for Nup98. We refer 
to this model as Nup98/X herein. 

For modeling of Nup205 molecules, we gen- 
erated the secondary structural elements of 
Nup205 on the basis of structural alignment of 
its functional orthologs and the crystal struc- 
ture of Nup205 orthologs in fungi using the 
AlphaFold model (15). Relying on structural 
features of the IR subunit in various organ- 
isms (22, 24, 25), previous biochemical char- 
acterizations of the Nup205 ortholog in fungi 
(21), and the EM density (fig. S5), we also 
modeled two Nup93 α5 helices inserted into 
the CTD of Nup205 molecules. Our model 
covers the entire α-helical region of Nup205, 
including a CTD- and vertebrate-specific TAIL-C 
(14) that harbor the docking site of Nup93 α5. 

In the center of the Nup358 region, the 
bridge domain of unknown identity connects 
the Nup358 clamps and associates with other 
surrounding nucleoporins (4). In both the 
NR from frog and human NPC, a rod of EM 
density with nearly identical shape resides in 
the same place as the bridge domain within 
the two concentric Y complex rings. Previous 
analysis pointed to TPR, a nuclear basket- 
specific nucleoporin, as a candidate that may 
occupy this location (10). The 4.7-Å EM map 
of the Nup358 region allowed generation of a 
poly-Ala model for the bridge domain. The 
poly-Ala model was used to search the PDB 
using the DALI server (53). This search led to 
identification of a number of potential candi- 
dates, of which the yeast NPC component Nic96 
tops the list, suggesting that the bridge domain 
is Nup93, the X. laevis ortholog of Nic96. 

Near the end of our study, DeepMind re- 
leased a database of 20,000 predicted struc- 
tures of the human proteome, which includes 
that of human Nup93 (54). To our satisfaction, 
the poly-Ala model of the bridge domain and 
the predicted model of human Nup93 display 
a root mean square deviation of 2.43 Å over 
416 aligned Cα atoms (fig. S10A). In fact, the 
predicted structure of human Nup93 can be 
directly docked into the EM density map for 
the bridge domain without adjustment (fig. 
S10B). The nearly perfect fitting of human 
Nup93 at the secondary structure level strongly 
supports the identification of the bridge do- 
main to be Nup93. This conclusion is further 
supported by mass spectrometric analysis of 
the cross-linked X. laevis NE, in which Nup93 
was found to be mainly cross-linked to Nup358 
among the nucleoporins (fig. S10C). 
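
For readers unfamiliar with the metric, a root mean square deviation over aligned Cα atoms is commonly computed after optimal rigid-body superposition. The sketch below uses the standard Kabsch algorithm; it is an illustration only, assumes that the paired Cα coordinates are already available as two (N, 3) arrays, and is not necessarily the procedure used to obtain the value reported above. 

```python
import numpy as np


def kabsch_rmsd(p: np.ndarray, q: np.ndarray) -> float:
    """RMSD between two paired (N, 3) coordinate sets after optimal rigid superposition."""
    p0 = p - p.mean(axis=0)                       # remove translation
    q0 = q - q.mean(axis=0)
    h = p0.T @ q0                                 # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))        # guard against an improper rotation
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T     # rotation taking p0 onto q0
    diff = (rot @ p0.T).T - q0
    return float(np.sqrt((diff ** 2).sum() / len(p)))
```

The function returns a value in the same units as the input coordinates, so coordinates in Å yield an RMSD in Å. 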

Using AlphaFold (15), we generated a pre- 
dicted atomic model for X. laevis Nup93, which 
fits the EM density map exceedingly well 
(fig. S10D). The surface loop region of the 
predicted model for X. laevis Nup93 was 
manually adjusted into the EM density. This 
final model of X. laevis Nup93 is almost identical 
to our initial poly-Ala model of the bridge do- 
main but contains more features. The featured 
model of X. laevis Nup93 (fig. S10E) was used 
for all subsequent structural analysis. The structure 
of X. laevis Nup93 α-solenoid (residues 180 
to 820), with a fold-back conformation characteristic 
of ACE1 (20), consists of a trunk module 
(α6 to α9 and α18 to α28), a crown module (α10 
to α17), and a tail module (α29 to α37) (Fig. 4B 
and fig. S10E). In addition, in the N terminus of 
the Nup93 α-solenoid, an extended α-helix (α5) 
inserted in the CTD of Nup205 was also modeled 
(figs. S5 and S10E). The corresponding region 
in yeast Nic96 was reported to bind the 
CTD of Nup192 (the fungal homolog of Nup205) 
(21). Secondary structural elements were assigned 
on the basis of the sequence alignment of 
functional orthologs of Nup93 and the AlphaFold 
model. The EM maps also allow identification 
of protein-protein interfaces mediated by 
Nup93 (figs. S11, S12, and S14). 


With one Nup93-ACE1 assigned to the bridge 
domain and two Nup93 α5 helices identified to 
be inserted into the CTD of Nup205 molecules, 
we investigated whether a second copy of 
Nup93-ACE1 was present within the CR sub- 
unit. The current molecular model left a chunk 
of unassigned density spanning the Nup107-I 
from one subunit and Nup205-O from the 
adjacent subunit that adopts a shape similar to 
that of the bridge domain. Local refinements 
focusing on this chunk of density revealed that 
a second copy of Nup93-ACE1 was indeed 
present in the CR subunit. Direct docking of 
the model of Nup93-ACE1-O into this region 
resulted in a nearly perfect match (fig. S10F). 


Modeling and structural analyses of other 
nucleoporins were facilitated by sequence 
alignment (55, 56) of functional orthologs of 
the structurally defined CR components and 
AlphaFold-predicted models. Overall, the final 
model of the CR subunit includes the inner and 
outer Y complexes, two Nup205 molecules, five 
Nup358 clamps, two Nup93 molecules, and one 
copy each of Nup88, Nup98/X, and Nup155. 
This model includes 19,037 amino acids in 749 
α-helices and 380 β-strands (tables S2 and S3). 
The final model was refined using Phenix (50) 
with secondary structure restraints and vali- 
dated through examination of the Molprobity 
scores and statistics of the Ramachandran plots 
(57) (table S1). Together with the EM density 
map, this model allows structural analysis of 
Nup205 (Fig. 3), Nup93 (Fig. 4 and figs. S10, S11, 
S12, and S14), and Nup358 (Fig. 5 and figs. S13 
and S14) clamps. In addition, the final model 
provides a basis for the identification of inter- 
subunit protein-protein interfaces (fig. S15) and 
structural comparisons among Y complexes 
from various species (fig. S16) and between 
individual components of the CR (fig. S17 and 
table S4). 


The figures for structures and maps were 
generated using Pymol, Chimera, or ChimeraX 
(51, 58). Sequence alignments were done in 
Clustal Omega (55) and ESPript 3.0 (56). 


INTRODUCTION: The subcellular compartmentalization 
of eukaryotic cells requires selective 
transport of folded proteins and protein- 
nucleic acid complexes. Embedded in nuclear 
envelope pores, which are generated by the 
circumscribed fusion of the inner and outer 
nuclear membranes, nuclear pore complexes 
(NPCs) are the sole bidirectional gateways for 
nucleocytoplasmic transport. The ~110-MDa 
human NPC is an ~1000-protein assembly 
that comprises multiple copies of ~34 dif- 
ferent proteins, collectively termed nucleo- 
porins. The symmetric core of the NPC is 
composed of an inner ring encircling the 
central transport channel and outer rings 
formed by Y-shaped coat nucleoporin com- 
plexes (CNCs) anchored atop both sides of 
the nuclear envelope. The outer rings are 
decorated with compartment-specific asym- 
metric nuclear basket and cytoplasmic 
filament nucleoporins, which establish trans- 
port directionality and provide docking sites 
for transport factors and the small guanosine 
triphosphatase Ran. The cytoplasmic fila- 
ment nucleoporins also play an essential 
role in the irreversible remodeling of mes- 
senger ribonucleoprotein particles (mRNPs) 
as they exit the central transport channel. 
Unsurprisingly, the NPC’s cytoplasmic 
face represents a hotspot for disease-asso- 
ciated mutations and is commonly tar- 
geted by viral virulence factors. 


RATIONALE: Previous studies established 
a near-atomic composite structure of the 
human NPC’s symmetric core by com- 
bining (i) biochemical reconstitution to 
elucidate the interaction network between 
symmetric nucleoporins, (ii) crystal and 
single-particle cryo-electron microscopy 
structure determination of nucleoporins 
and nucleoporin complexes to reveal 
their three-dimensional shape and the 
molecular details of their interactions, (iii) 
quantitative docking in cryo-electron tomog- 
raphy (cryo-ET) maps of the intact human NPC 
to uncover nucleoporin stoichiometry and po- 
sitioning, and (iv) cell-based assays to validate 
the physiological relevance of the biochem- 
ical and structural findings. In this work, we 
extended our approach to the cytoplasmic 




filament nucleoporins to reveal the near-atomic 
architecture of the cytoplasmic face of the hu- 
man NPC. 


RESULTS: Using biochemical reconstitution, 
we elucidated the protein-protein and protein- 
RNA interaction networks of the human and 
Chaetomium thermophilum cytoplasmic filament
nucleoporins, establishing an evolutionarily


Cytoplasmic face of the human NPC. Near-atomic composite 
structure of the NPC generated by docking high-resolution crystal 
structures into a cryo-ET reconstruction of an intact human NPC. 
The symmetric core, embedded in the nuclear envelope, is 
decorated with NUP358 (red) domains bound to Ran (gray), flexibly 
projected into the cytoplasm, and CFNCs (pink) overlooking the 
central transport channel. 


conserved heterohexameric cytoplasmic 
filament nucleoporin complex (CFNC) held to- 
gether by a central heterotrimeric coiled-coil hub 
that tethers two separate mRNP-remodeling 
complexes. Further biochemical analysis and 
determination of a series of crystal structures 
revealed that the metazoan-specific cytoplasmic 
filament nucleoporin NUP358 is composed of 


16 distinct domains, including an N-terminal 
S-shaped α-helical solenoid followed by a 
coiled-coil oligomerization element, numerous 
Ran-interacting domains, an E3 ligase domain, 
and a C-terminal prolyl-isomerase domain. Phys- 
iologically validated quantitative docking into 
cryo-ET maps of the intact human NPC revealed 
that pentameric NUP358 bundles, conjoined 
by the oligomerization element, are anchored 
through their N-terminal domains to the central 
stalk regions of the CNC, projecting flexibly 
attached domains as far as ~600 Å into the cytoplasm.
Using cell-based assays, we demonstrated
that NUP358 is dispensable for the 
architectural integrity of the assembled inter- 
phase NPC and RNA export but is required for 
efficient translation. After NUP358 assign- 
ment, the remaining 4-shaped cryo-ET den- 
sity matched the dimensions of the CFNC 
coiled-coil hub, in close proximity to an 
outer-ring NUP93. Whereas the N-terminal 
NUP93 assembly sensor motif anchors the 
properly assembled related coiled-coil 
channel nucleoporin heterotrimer to the 
inner ring, biochemical reconstitution 
confirmed that the NUP93 assembly sen- 
sor is reused in anchoring the CFNC to 
the cytoplasmic face of the human NPC. 
By contrast, two C. thermophilum CFNCs 
are anchored by a divergent mechanism 
that involves assembly sensors located in 
unstructured portions of two CNC nucle- 
oporins. Whereas unassigned cryo-ET 
density occupies the NUP358 and CFNC 
binding sites on the nuclear face, docking 
of the nuclear basket component ELYS 
established that the equivalent position 
on the cytoplasmic face is unoccupied, 
suggesting that mechanisms other than 
steric competition promote asymmetric 
distribution of nucleoporins. 


CONCLUSION: We have substantially ad- 
vanced the biochemical and structural 
characterization of the asymmetric nu- 
cleoporins’ architecture and attachment 
at the cytoplasmic and nuclear faces of 
the NPC. Our near-atomic composite 
structure of the human NPC’s cytoplas- 
mic face provides a biochemical and 
structural framework for elucidating the 
molecular basis of mRNP remodeling, 
viral virulence factor interference with 
NPC function, and the underlying mecha- 
nisms of nucleoporin diseases at the cytop- 
lasmic face of the NPC. 
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The nuclear pore complex (NPC) is the sole bidirectional gateway for nucleocytoplasmic transport. 
Despite recent progress in elucidating the NPC symmetric core architecture, the asymmetrically 
decorated cytoplasmic face, essential for messenger RNA (mRNA) export and a hotspot for nucleoporin- 
associated diseases, has remained elusive. Here we report a composite structure of the human 
cytoplasmic face obtained by combining biochemical reconstitution, crystal structure determination, 
docking into cryo-electron tomographic reconstructions, and physiological validation. Whereas species- 
specific motifs anchor an evolutionarily conserved ~540-kilodalton heterohexameric cytoplasmic 
filament nucleoporin complex above the central transport channel, attachment of the NUP358 
pentameric bundles depends on the double-ring arrangement of the coat nucleoporin complex. Our 
composite structure and its predictive power provide a rich foundation for elucidating the molecular 


basis of mRNA export and nucleoporin diseases. 


The sequestration of genetic material in 
the nucleus represents one of the hallmarks
of evolution but creates the necessity
for selective bidirectional transport 
across the nuclear envelope (1-4). The nuclear
pore complex (NPC) is the sole gateway 
through which folded proteins and protein- 
nucleic acid complexes cross the nuclear en- 
velope, making this transport organelle an 
essential machine for all eukaryotic life. Besides 
its direct role as a transport channel, the NPC 
serves as an organizer for nuclear and cyto- 
plasmic processes that are essential for the 
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flow of genetic information from DNA to RNA 
to protein, including transcription, spliceosome 
assembly, mRNA export, and ribosome assembly
(1-4). Dysfunction of the NPC or its components
represents a major cause of human 
disease (2, 5, 6). 

Architecturally, the NPC consists of a central 
core with eightfold rotational symmetry across 
a nucleocytoplasmic axis and twofold rotational
symmetry across the plane of the nuclear 
envelope, which links to compartment-specific 
asymmetric cytoplasmic filaments (CFs) and 
a nuclear basket structure (Fig. 1A) (1, 2). 
The NPC is built from ~34 different proteins, 
termed nucleoporins (nups), that are orga- 
nized into distinct subcomplexes. Multiple 
copies of each nup in the NPC add up to an 
assembly that reaches a molecular mass of 
~110 MDa in vertebrates. The symmetric core 
of the NPC is composed of an inner ring and 
two spatially segregated outer rings. The inner 
ring is embedded in nuclear envelope pores 
generated by the circumscribed fusion of the 
double membrane of the nuclear envelope. The 
diffusion barrier is formed by unstructured 
phenylalanine-glycine (FG) repeats that fill the 
central transport channel, imposing a grad- 
ually increasing barrier to passive diffusion 
of macromolecules >40 kDa (1-4). Transport 
factors, collectively termed karyopherins, over- 
come the diffusion barrier by binding to FG 
repeats, thereby transporting cargo across the 
nuclear envelope (7-9). A substantial fraction 
of the FG repeats in the inner ring is con- 
tributed by a heterotrimeric channel nup 
complex (CNT), which is anchored by a single 


assembly sensor motif (10-12). The outer rings 
sit atop the nuclear envelope, sandwiching the 
inner ring from both sides. The outer rings 
are primarily formed by the Y-shaped coat 
nup complex (CNC; also referred to as the 
Y-complex or the Nup107-160 complex) and 
serve as a platform for asymmetric incorpo- 
ration of the CF and nuclear basket nups. 
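
As a rough plausibility check on the approximate figures quoted in this article (an ~110-MDa assembly of ~1000 protein copies drawn from ~34 different nups, with an ~56-MDa symmetric core), the short Python sketch below derives the implied averages. The function and constant names are illustrative only, and the results are back-of-the-envelope values rather than measurements from this study.

```python
# Approximate figures quoted in the text; every value is a "~" estimate.
TOTAL_NPC_MASS_MDA = 110.0       # human NPC
SYMMETRIC_CORE_MASS_MDA = 56.0   # symmetric core of the human NPC
DISTINCT_NUPS = 34               # different nucleoporins
PROTEIN_COPIES = 1000            # protein copies per NPC


def mean_copies_per_nup(copies: int, distinct: int) -> float:
    """Average number of copies of each distinct nup in one NPC."""
    return copies / distinct


def mean_mass_per_copy_kda(total_mass_mda: float, copies: int) -> float:
    """Average mass of one protein copy, converted from MDa to kDa."""
    return total_mass_mda * 1000.0 / copies


def mass_fraction(part_mda: float, whole_mda: float) -> float:
    """Fraction of the whole assembly's mass contributed by one part."""
    return part_mda / whole_mda


if __name__ == "__main__":
    print(f"mean copies per nup: {mean_copies_per_nup(PROTEIN_COPIES, DISTINCT_NUPS):.1f}")
    print(f"mean mass per copy: {mean_mass_per_copy_kda(TOTAL_NPC_MASS_MDA, PROTEIN_COPIES):.0f} kDa")
    print(f"symmetric core share: {mass_fraction(SYMMETRIC_CORE_MASS_MDA, TOTAL_NPC_MASS_MDA):.2f}")
```

On these rounded inputs, each nup is present in roughly 30 copies on average and the symmetric core accounts for about half of the total mass, which is consistent with the eightfold symmetric, multi-ring architecture described above.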

Two decades ago, the atomic-level character- 
ization of the NPC began with individual nup 
domains and progressed to nup complexes 
of increasing size and complexity, culminat- 
ing in the ~400-kDa heteroheptameric CNC 
(11, 13-28). Simultaneously, advances in cryo- 
electron tomography (cryo-ET) data acquisi- 
tion and processing gradually increased the 
resolution of intact NPC three-dimensional 
(3D) reconstructions (29). Docking of the CNC 
into a ~32-Å cryo-ET map of the intact human 
NPC demonstrated that two reticulated eight- 
membered CNC rings, linked by head-to-tail 
interactions, are present on each side of the 
nuclear envelope (27, 30). Moreover, this ad- 
vance established that the resolution gap 
between high- and low-resolution structural 
methods can be overcome by combining bio- 
chemical reconstitution and x-ray crystallo- 
graphic characterization of nups with cryo-ET 
reconstruction of the intact NPC. Expansion 
of this approach to the nine nups constitut- 
ing the inner ring rapidly led to the recon- 
stitution of two distinct ~425-kDa inner-ring 
complexes (IRCs) and the elucidation of their 
components’ structures (10-12, 20, 31-38). In 
turn, this advance enabled the determination 
of the near-atomic composite structure of the 
entire ~56-MDa symmetric core of the human 
NPC, establishing the stoichiometry and placement
of all 17 symmetric nups within a ~23-Å 
cryo-ET reconstruction (38, 39). Subsequently, the 
architecture of the Saccharomyces cerevisiae 
NPC was determined by means of a similar 
approach, using high-resolution nup crystal 
structures and ~25-Å cryo-ET maps of either 
detergent-purified or in situ NPCs (40, 41). Com- 
pared with the human NPC, the S. cerevisiae 
NPC lacks the distal CNC ring and associated 
nups on both sides of the nuclear envelope, 
but the relative nup arrangement within the 
rest of the symmetric core remains essentially 
identical (38, 39, 42). 

Projecting from the cytoplasmic face of the 
NPC, the CF nups recruit cargo•transport factor
complexes for nucleocytoplasmic transport 
and orchestrate the export and remodeling of 
messenger ribonucleoprotein particles (mRNPs) 
in preparation for translation (2, 43). The nine- 
component CF nup machinery represents a 
hotspot for human diseases ranging from 
degenerative brain disorders and cardiac dis- 
eases to cancer (2, 5, 6). Although linked to the 
human CF nups NUP358, NUP214, NUP62, 
NUP88, NUP98, GLE1, NUP42, RAE1, and 
DDX19, the pathophysiology and optimal 
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Fig. 1. Reconstitution of a 16-protein C. thermophilum complex composed of 
coat and CF nups. (A) Cross-sectional schematic of the fungal NPC architecture. 
NE, nuclear envelope. (B) Domain structures of the coat and CF nups. 
(C) Schematic representation summarizing our biochemical reconstitution and 
dissection experiments with purified recombinant C. thermophilum nups, 
illustrating the CFNC architecture and its attachment to the CNC. The CNC harbors 
two assembly sensors, Nup37^CTE and Nup145C^NTE, each anchoring a CFNC via 
its central hub, with Nup37^CTE exhibiting tighter binding than Nup145C^NTE. 
(D to F) SEC-MALS interaction analyses, showing the stepwise biochemical 
reconstitution starting with (D) the CFNC (green) from Nup82•Nup159•Nsp1 (blue), 
Gle2•Nup145N (cyan), and Dbp5 (red); then (E) CFNC•Gle1•Nup42^GBM (green) from 
CFNC (blue) and Gle1•Nup42^GBM (red); and culminating with (F) the 16-protein 
CNC•CFNC•Gle1•Nup42^GBM complex (green) from CNC (red), CFNC (blue), and 
Gle1•Nup42^GBM (cyan). SDS-PAGE gel strips of peak fractions are shown. Measured 
molecular masses are indicated, with respective theoretical masses in parentheses. 
(G and H) LLPS interaction assays, assessing (G) CFNC (red) and Gle1•Nup42^GBM 
(cyan) incorporation into CNC-LLPS (green), and (H) CFNC incorporation into 
CNC-LLPS, lacking either one or both Nup37^CTE and Nup145C^NTE assembly sensors. 
N-terminally fluorescently labeled CNC (Bodipy), CFNC (Alexa Fluor 647), and 
Gle1•Nup42^GBM (Coumarin) were visualized by fluorescence microscopy. Pelleted 
CNC condensate phase (P) and soluble (S) fractions were analyzed by SDS-PAGE 
and visualized by Coomassie brilliant blue staining. Scale bars, 10 μm. 


Bley et al., Science 376, eabm9129 (2022) 10 June 2022 
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therapeutic strategies for these conditions 
remain ill defined. 

Here we present insight into the atomic and 
higher-order architecture, function, and mecha- 
nism of action of the CF nups in the human and 
thermophilic fungus Chaetomium thermophilum 
NPCs. First, we uncover a conserved modular 
architecture within the heterohexameric CF 
nup complex (CFNC) of both species: Holding 
the CFNC together is a coiled-coil hub built 
like the CNT but formed by NUP62 with the 
C-terminal regions of NUP88 and NUP214, 
while their N-terminal β-propeller domains 
link to the mRNA export factors NUP98•RAE1 
and the DEAD-box RNA helicase DDX19, re- 
spectively, which in turn recruit the remain- 
ing complex components. We further uncover 
evolutionarily divergent mechanisms for the 
attachment of the intact CFNC at the cytoplas- 
mic face of the NPC, which in C. thermophilum 
involves two distinct assembly sensors in the 
CNC that do not exist in humans. We assemble 
the C. thermophilum CNC and CF nups into a 
~1.1-MDa 16-protein complex and find that it 
can be remodeled by inositol hexaphosphate 
(IP6). Toward dissecting the molecular mechanism
of mRNA export, we systematically characterize
the propensity of CF nups for RNA 
binding and find previously unidentified capabilities
in two CFNC subcomplexes (GLE1•NUP42 
and NUP88•NUP214•NUP98) as well as different
parts of the metazoan-specific NUP358. 
To build a composite structure of the human 
NPC cytoplasmic face, we determine crystal 
structures of the NUP88^NTD•NUP98^APD complex
and all remaining structurally uncharacterized
regions of NUP358, uncovering a 
previously unobserved S-shaped fold of three 
α-helical solenoids of the NUP358 N-terminal 
domain as well as a complex mechanism for 
NUP358 oligomerization. Docking of the newly 
identified structures, along with previously 
characterized CF nups, into a previously reported
~23-Å cryo-ET map and a new ~12-Å 
cryo-ET map of the intact human NPC (provided
by the Beck group), as well as an ~8-Å 
region of an anisotropic single-particle cryo- 
electron microscopy (cryo-EM) composite map 
of the Xenopus laevis cytoplasmic NPC face, 
accounts for all of the asymmetric density on 
the cytoplasmic NPC side resolved in the maps 
(44-46). Validating our quantitative docking 
analysis in human cells engineered to enable 
rapid, inducible NUP358 depletion, we sur- 
prisingly find NUP358 to be dispensable for 
the architectural integrity of the assembled 
interphase NPC and mRNA export but to have 
a general role in translation. Docking of the 
CFNC hub in close proximity to a NUP93 frag- 
ment that, in the inner ring, acts as the assem- 
bly sensor for the CNT allows us to predict and 
experimentally confirm that NUP93 also re- 
cruits the structurally related CFNC on the 
cytoplasmic face, thereby enabling identifica- 
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tion of the elusive human CFNC NPC anchor. 
Thus, our near-atomic composite structure has 
predictive power, demonstrating its general 
utility for the mechanistic dissection of essen- 
tial cellular events that occur on the cytoplasmic 
face of the human NPC. 


Results 
Modular architecture of the evolutionarily 
conserved six-protein CFNC 


Although pairwise interactions between se- 
lected CF nups had previously been reported, 
comprehensive knowledge on the entire CF 
nup interaction network has remained un- 
available (11, 47-64). Using nups from the 
thermophilic fungus C. thermophilum, which 
exhibit superior biochemical stability, we pre- 
viously elucidated the interaction network of 
the 17 symmetric core nups (38). Therefore, we 
first sought to establish the protein-protein 
interaction network and complex stoichi- 
ometry of the eight evolutionarily conserved 
C. thermophilum CF nups: Nup159, Nup82, 
Nsp1, Nup145N, Gle2, Dbp5, Gle1, and Nup42 
(Fig. 1B and fig. S1) (2). Most CF nups contain 
both structured and unstructured regions that 
can harbor multiple distinct binding sites and 
FG repeats. We established expression and 
purification protocols for the C. thermophilum 
CF nups, omitting FG-repeat regions as well 
as an unstructured linker region in Nup145N 
to improve solubility, and analyzed their bind- 
ing by size-exclusion chromatography coupled 
with multiangle light scattering (SEC-MALS) 
and a liquid-liquid phase separation (LLPS) 
interaction assay (Fig. 1, figs. S1 to S26, and 
tables S1 to S6). For a detailed description 
of these experiments, see the supplemen- 
tary text. 

Mixture of Nup82•Nup159•Nsp1 with 
Gle2•Nup145N and Dbp5 results in the formation
of a stoichiometric heterohexameric 
CFNC (Fig. 1, C and D, and fig. S4A), which is 
held together by a parallel coiled-coil hetero- 
trimer formed by the C-terminal domains of 
Nup82•Nup159•Nsp1, termed the CFNC hub 
(figs. S4 to S6). The CFNC is tethered to the 
NPC by two mutually exclusive assembly sen- 
sors targeting the CFNC hub. These anchor 
points are located within primarily unstruc- 
tured regions [N-terminal extensions (NTEs) 
and C-terminal extensions (CTEs)] present in the 
CNC constituents Nup37^CTE and Nup145C^NTE, 
which supply a strong and weak binding site, 
respectively, permitting two CFNCs to bind a 
single CNC (figs. S7 to S17). The Gle1•Nup42 
complex has also been shown to locate at the 
cytoplasmic face of the NPC, forming an IP6-dependent
interaction with Dbp5 (53, 58, 63, 65). 
We demonstrate stoichiometric incorporation
of Gle1•Nup42 into both the CFNC and 
CNC•CFNC complexes in the presence of IP6 
(Fig. 1, E and F, and figs. S18 to S23). Additionally,
we identify an interaction formed 
between Gle1•Nup42 and the CNC that is 
disrupted upon addition of IP6, establishing 
that the CNC-CF nup interaction network can 
be remodeled (figs. S23 to S25). 

Given the special importance of the CF nups 
in human disease, we next tested whether the 
molecular architecture of the CFNC is evolu- 
tionarily conserved from C. thermophilum to 
humans. The human CFNC is composed of 
NUP88, NUP214, NUP62, NUP98, RAE1, and 
DDX19. Apart from a rearrangement of the 
FG-repeat and coiled-coil regions in NUP214, 
the domain organization of the human CFNC 
nups is identical to that of the C. thermophilum 
orthologs (Fig. 2 and fig. S27). Indeed, mixing 
the NUP88•NUP214•NUP62 heterotrimer with 
RAE1•NUP98 and DDX19 resulted in a stoichiometric
Homo sapiens CFNC heterohexamer 
(Fig. 2C and figs. S28 and S29). Similarly, a 
systematic pairwise interaction analysis estab- 
lished that the modular CFNC architecture 
characterized in C. thermophilum is conserved 
in humans (Fig. 2, D and E, and figs. S30 to S39). 

Together, our data establish that the CF nups 
form an evolutionarily conserved six-protein 
complex held together by an extensive parallel 
coiled-coil hub generated by the C-terminal 
regions of Nup82/NUP88, Nup159/NUP214 
and Nsp1/NUP62, which shares architectural 
similarities with the heterotrimeric Nsp1/NUP62•Nup49/NUP58•Nup57/NUP54
CNT 
(11). The Nup82/NUP88 N-terminal β-propeller 
domain is attached by an interaction with 
the C-terminal α-helical TAIL fragment of 
Nup159/NUP214 and provides a binding site 
for the Nup145N/NUP98 autoproteolytic domain
(APD), which in turn recruits Gle2/RAE1 
to the NPC. Analogously, the Nup159/NUP214 
N-terminal β-propeller domain provides a 
binding site for the DEAD-box helicase Dbp5/DDX19.
In C. thermophilum, the CFNC hub is 
anchored to the CNC by two distinct assembly 
sensors in Nup37^CTE and Nup145C^NTE, similar 
to the anchoring of the CNT by the Nic96^R1/NUP93^R1
assembly sensor in the inner ring. By 
contrast, the human CNC lacks comparable 
assembly sensor motifs, which suggests that 
there are alternative mechanisms for anchor- 
ing CF nups at the cytoplasmic face of the 
human NPC. 


RNA interactions of human CF nups 


Given their essential roles in mRNA export, we 
next sought to identify which of the human CF 
nups had RNA-binding capabilities (50, 66-69). 
Previous work that used disparate methods and 
diverse but inconsistent probes had established 
DDX19 and RAE1•NUP98^GLEBS binding to U10 
single-stranded RNA (ssRNA), degenerate 
ssRNA, poly(A), poly(C), poly(G) RNA, as well 
as ssDNA and double-stranded DNA (dsDNA) 
across a variety of assays (50, 56, 57, 63, 70). 
Taking advantage of our complete set of puri- 
fied human CF nup domains and subcomplexes, 
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Fig. 2. Conserved modular architecture and RNA-binding properties of the 
human CF nups. (A) Cross-sectional schematic of the human NPC architecture. 
(B) Domain structures of human CF nups. Nomenclature of H. sapiens and 

C. thermophilum nup orthologs is indicated. (C) Biochemical reconstitution of the 
~310-kDa heterohexameric human CFNC. SEC-MALS interaction analysis of 
NUP88•NUP214•NUP62 (blue), DDX19 (ADP) (red), RAE1•NUP98 (cyan), and 
their preincubation (green). Measured molecular masses are indicated, with 
theoretical masses in parentheses. SDS-PAGE gel strips of peak fractions 
visualized by Coomassie brilliant blue staining are shown. (D) Summary of 
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on analyses between human CF nups. (E) Schematic 


summary of the human CFNC architecture and the CF nup interaction 
network. (F) Human CF nup domains and complexes were assayed for 
binding to ssRNA and dsDNA probes by EMSA. Input proteins resolved by 
SDS-PAGE were visualized by Coomassie brilliant blue staining. Qualitative 
assessment of nucleic acid binding is denoted by color-coded boxes. (G and 
H) EMSAs with ssRNA titrated against (G) metazoan-specific NUP358^NTD 
and NUP358^RanBD-IV•Ran(GMPPNP), and (H) the indicated H. sapiens CFNC 
subcomplexes and their C. thermophilum orthologs. 
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we carried out a comprehensive electrophoretic 
mobility shift assay (EMSA) screen to systemat- 
ically assess binding against a consistent set 
of ssRNA and dsRNA probes (Fig. 2F). In ad- 
dition to the established RNA binders DDX19 
and RAE1•NUP98, we identified ssRNA binding
by GLE1^CTD•NUP42^GBM and NUP88^NTD, 
an activity enhanced in the context of the 
NUP88^NTD•NUP98^APD•NUP214^TAIL complex. 
We tested their C. thermophilum orthologs 
and found these RNA binding activities to be 
conserved (Fig. 2, G and H, and fig. S40). Next, 
we analyzed the metazoan-specific NUP358. 
We detected moderate RNA binding for the 
NUP358 N-terminal domain (NTD) (Fig. 2F 
and fig. S40G). Additionally, we found that the 
four NUP358^RanBD•Ran(GMPPNP) complexes 
preferentially bound ssRNA (Fig. 2F and fig. 
S40E). Additional details of RNA binding can 
be found in the supplementary text. Future 
studies are needed to delineate whether these 
RNA-binding sites present sequence-specific 
RNA affinity and what the implications of 
such specificity would be in the overall mRNA 
export pathway. 


Structural and biochemical analyses of NUP358 


NUP358 is a 3224-residue metazoan-specific 
CF component and the largest constituent of 
the NPC (71-74). Previous studies established 
that its N-terminal ~900-residue α-helical 
region is necessary for nuclear envelope re- 
cruitment (75). Within this region, the first 
145 residues have been biochemically and 
structurally characterized, shown to form 
three tetratricopeptide repeats (TPRs) (76). 
Guided by secondary structure predictions, 
we systematically screened expression constructs
for solubility, identifying three fragments:
NUP358^NTDΔTPR (residues 145 to 752), 
NUP358^NTD (residues 1 to 752), and NUP358^NTD-OE 
(an extended region spanning residues 1 to 
832). Subsequent purifications revealed that 
the NUP358^NTD and NUP358^NTD-OE fragments 
behave differently, with the latter forming 
amorphous precipitates in buffers with NaCl 
concentrations below 300 mM. Therefore, we 
characterized these NUP358 fragments in 
both high-salt (350 mM NaCl) and low-salt 
(100 mM NaCl) buffers wherever possible. 
NUP358^NTD exhibited concentration-dependent
homodimerization in low-salt 
buffer, with measured molecular masses be- 
tween values corresponding to monomeric 
and dimeric species, but existed as a mono- 
meric species in high-salt buffer (Fig. 3, A and 
B, and fig. S41). Conversely, NUP358^NTDΔTPR 
was exclusively monomeric, suggesting that 
the TPR mediates homodimerization (Fig. 
3B and fig. S41). Furthermore, the extended 
NUP358^NTD-OE fragment forms oligomers with 
measured molecular masses between those 
of a tetramer and a pentamer (Fig. 3C and 
fig. S41). Subsequent C-terminal mapping 
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revealed an oligomerization element (OE) 
between residues 802 and 832, forming salt- 
insensitive concentration-dependent oligomers 
between dimers and tetramers (Fig. 3G and 
fig. S42). Thus, NUP358 oligomerization is 
mediated by the TPR and OE regions, located 
on opposite sides of the N-terminal α-helical 
region. 
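
The oligomeric states discussed above are read off by comparing a SEC-MALS molecular mass with the theoretical mass of one monomer. The Python sketch below illustrates that comparison under stated assumptions: the monomer mass is approximated from the residue count using a rule-of-thumb average of 110 Da per residue (not the sequence-derived mass used in the study), and the measured mass in the usage example is a hypothetical placeholder, not a value reported here.

```python
import math

# ASSUMPTION: rule-of-thumb average mass per amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_monomer_mass_kda(n_residues: int) -> float:
    """Approximate monomer mass (kDa) from the number of residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def apparent_oligomeric_state(measured_kda: float, monomer_kda: float) -> float:
    """Ratio of the measured mass to the monomer mass (may be non-integer)."""
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    return measured_kda / monomer_kda


def bracketing_states(ratio: float) -> tuple[int, int]:
    """Integer oligomeric states that bracket a measured ratio."""
    return math.floor(ratio), math.ceil(ratio)


if __name__ == "__main__":
    # Construct spanning residues 1 to 832, as described in the text.
    monomer = estimate_monomer_mass_kda(832)
    # HYPOTHETICAL measured mass, chosen only to illustrate the calculation.
    measured = 410.0
    ratio = apparent_oligomeric_state(measured, monomer)
    low, high = bracketing_states(ratio)
    print(f"monomer ~{monomer:.1f} kDa, ratio {ratio:.2f}, between {low}-mer and {high}-mer")
```

A non-integer ratio of this kind is what one would expect for a concentration-dependent equilibrium between species, which is how the text interprets masses that fall between those of a tetramer and a pentamer.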

To aid the crystallization of the entire 
NUP358^NTD, we generated high-affinity synthetic
antibody fragments (sABs) by phage 
display selection (77). By systematically screening
the generated 62 sABs as crystallization 
chaperones, we identified a NUP358^NTD•sAB-14 
complex that crystallized, enabling de novo structure
determination of the entire NUP358^NTD 
at 3.95-Å resolution (tables S7 to S10). To unambiguously
assign the NUP358^NTD sequence 
register, we crystallized 17 seleno-L-methionine 
mutants (fig. S43 and tables S11 and S12). 

The asymmetric unit contained two copies 
of the NUP358^NTD•sAB-14 complex, in one of 
which the first three and a half TPR repeats 
are not resolved. The second copy forms ex- 
tensive interactions with a symmetry-related 
molecule (Fig. 3, D to F, and fig. S44). This 
NUP358^NTD dimer reveals two alternative TPR 
conformations in which the TPR either forms a 
continuous N-terminal solenoid (open) or folds 
back, separating TPR4 and forming electro- 
static interactions with HEAT repeats 5 to 7 
of the N-terminal solenoid (closed) (Fig. 3F, 
fig. S44C, and Movie 1). Toggling between 
these two states provides a molecular expla- 
nation for the salt-sensitive, concentration- 
dependent dimerization behavior of NUP358^NTD 
(Fig. 3B). Because the open conformation is the 
one identified in the intact NPC (see below), 
we focus our description on this state. 

The open conformation of NUP358^NTD can 
be divided into three sections: an N-terminal 
α-helical solenoid composed of four TPRs and 
four HEAT repeats, a central α-helical wedge 
domain, and a short C-terminal α-helical solenoid
formed by three HEAT repeats (fig. 
S44D). The N- and C-terminal TPR and HEAT 
repeats are capped by solvating helices. Inserted
between α17 of the N-terminal solenoid 
and α20 of the wedge domain is a ~50-residue 
loop that wraps around the convex face of the 
N-terminal solenoid. The N-terminal solenoid 
and wedge domain form a composite concave 
surface with a pronounced overall positive 
charge (figs. S45 and S46). The central wedge 
domain makes extensive hydrophobic con- 
tacts with the sides of the N- and C-terminal 
solenoids, generating a noncanonical S-shaped 
architecture (fig. S44D). Indeed, a Dali 3D 
search of the Protein Data Bank (PDB) revealed 
that the NUP358^NTD architecture has not been 
observed previously (78). 

Our biochemical analysis revealed that 
NUP358^NTD interacts weakly with NUP88^NTD 
and has RNA-binding activity, both of which 
were salt sensitive (figs. S47 and S48). By 
splitting NUP358^NTD into two fragments, 
NUP358^TPR and NUP358^NTDΔTPR, we show that 
both halves are necessary yet insufficient for 
either NUP88^NTD or RNA binding (figs. S47 
and S48). To further map these interactions, 
we performed a saturating NUP358^NTD surface
mutagenesis, screening 106 mutants for 
NUP88^NTD and RNA binding (fig. S49). We 
found that positively charged residues in 
the concave surface mediate binding to both 
NUP88^NTD and RNA. Mutations that abolished 
NUP88^NTD binding clustered exclusively on the 
N-terminal solenoid, whereas disruption of RNA
binding required additional mutations in the wedge 
domain. By systematically combining individual
alanine substitutions, we identified a 
NUP358^NTD 2R5K mutation, which abolished 
both interactions (fig. S50). 

Next, we determined the crystal structure 
of NUP358^OE at 1.1-Å resolution (table S13). 
NUP358^OE is a small α-helical element that 
homotetramerizes to form an antiparallel bundle
(Fig. 3, G and H, and fig. S42A). The core of 
the α-helical bundle is lined with hydrophobic 
residues that coordinate oligomeric interhelical 
packing, demonstrated by the monomeric form 
assumed by the NUP358^OE LIQIML mutant 
(Fig. 3G; fig. S42, B to E; and Movie 2). To validate
our NUP358^OE structure, we tested the 
effect of introducing the LIQIML mutation 
into the larger NUP358^NTD-OE. Whereas wild-type
NUP358^NTD-OE formed higher-order oligomeric
species, the oligomerization profile of 
the LIQIML NUP358^NTD-OE mutant matched 
that of the OE-less NUP358^NTD, presenting 
concentration-dependent dimerization in 
low-salt buffer but persisting in a monodis- 
perse monomeric state in high-salt buffer 
(fig. S42, F and G). 

Our data show that NUP358^NTD is composed 
of three distinct α-helical solenoids that interact
in a previously unobserved manner, adopting
a distinctive overall S-shaped architecture 
with a propensity to form domain-swapped 
homodimers. Connected to NUP358^NTD by a 
~50-residue linker is an oligomerization element
that forms homotetramers/pentamers 
in solution. These dual modes of homo-oligomerization
provide a plausible explanation
for NUP358’s propensity to undergo phase 
separation, as observed during NPC assembly 
in Drosophila melanogaster oocytes (79). 


Ran interactions with human asymmetric nups 


Nucleocytoplasmic transport depends on kar- 
yopherin transport factors (Kaps) with direc- 
tionality imposed by a cellular gradient of the 
small guanosine triphosphatase (GTPase) 
Ran; nuclear Ran(GTP) is elevated by a fac- 
tor of ~200 compared with the primarily 
cytoplasmic Ran(GDP) (2, 7, 9). Multiple Ran- 
binding sites are distributed among the asym- 
metric nups at the cytoplasmic and nuclear 
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shown in cartoon and surface representation (left). The inset indicates the location of the magnified and 90°-rotated view of the Ran hydrophobic pocket (middle). Superposition 
of the six NUP358^ZnF•Ran(GDP) and four NUP153^ZnF•Ran(GDP) cocrystal structures with the Zn2+-coordinating cysteines and Ran-burying NTE hydrophobic residues shown 
as sticks (right). (L) Cocrystal structure of NUP358^RanBD-I•Ran(GMPPNP) with NUP358^RanBD-I shown in cartoon representation and Ran(GMPPNP) shown in cartoon (left) or 
surface (middle) representation. Superposition of Ran(GMPPNP) bound to NUP358^RanBD-I, NUP358^RanBD-II, NUP358^RanBD-III, NUP358^RanBD-IV, and NUP50^RanBD (right). 


sides of the NPC in the form of distinct Ran-binding
domains (RanBDs) and Zn2+-finger 
(ZnF) modules. On the cytoplasmic face, 
NUP358 contains four dispersed RanBDs and 
a central zinc finger domain (ZFD) with a tan- 
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dem array of eight ZnFs (Fig. 3A) (80). On 
the nuclear side, NUP153 and NUP50 con- 
tain a central ZFD with four ZnFs and a so- 
litary C-terminal RanBD, respectively (fig. S51) 
(81, 82). 


By testing the Ran(GDP/GTP)-binding ac- 
tivity of all 17 domains by SEC-MALS, we con- 
firmed that all domains bound to Ran, as 
expected, except for NUP358^ZnF1 (Fig. 3, I 
and J, and figs. S51 to S55). Consistent with 
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Movie 1. Structure of NUP358^NTD. A 360° rotation 
of the NUP358^NTD•sAB-14 cocrystal structure, 
illustrating the NUP358^NTD dimer between 
symmetry-related molecules, followed by a comparison
of the two possible TPR conformations 
giving rise to the open and closed states, 
concluding with a 360° rotation of the NUP358^NTD 
open conformation. 
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Movie 2. Structure of NUP358^OE. A 360° rotation 
of the homotetrameric NUP358^OE crystal 
structure, with hydrophobic core residues shown in ball-and-stick
representation, followed by an end-on view. 


previous reports, the RanBDs of NUP358 and 
NUP50 only bound Ran(GTP), whereas the 
ZnFs in NUP358 and NUP153 bound Ran in 
both nucleotide states but showed a preference 
for Ran(GDP) (figs. S52 and S54) (80, 83-85). 
To clarify the molecular basis for the differen- 
tial binding behaviors, we determined the 
cocrystal structures of all 16 domains bound 
to Ran in their preferred nucleotide-bound 
state at 1.8- to 2.45-A resolution (figs. S51 and 
S55, Movies 3 and 4, and tables S14 to S16). For 
an expanded description of these structures, 
see the supplementary text. 

Together, our data establish that the human CF and nuclear basket nups NUP358,
NUP153, and NUP50 harbor a total of 16 distinct Ran-binding sites that, given
their stoichiometry in the NPC, could together recruit up to several hundred
Ran molecules. Considering the substantial size difference between metazoan
and S. cerevisiae cells, it is conceivable that additional Ran-binding sites
provided by the metazoan-specific asymmetric nups NUP358 and NUP153 help ensure
that Ran concentrations in the NPC vicinity are high enough to enable
nucleocytoplasmic transport, as has been previously suggested (85).
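
The site count above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the domain counts are those stated in the text, the single non-binding domain is the one NUP358 ZnF reported not to bind Ran, and the copy number of 40 NUP358 molecules per NPC is the value reported in the docking analysis of this article. Copy numbers for NUP153 and NUP50 are not given in this section, so they are left out and the result is a lower bound. All function and variable names are hypothetical.

```python
# Domain counts per protein as stated in the text (four RanBDs and eight ZnFs
# in NUP358, four ZnFs in NUP153, one RanBD in NUP50).
RAN_DOMAINS = {
    "NUP358": {"RanBD": 4, "ZnF": 8},
    "NUP153": {"RanBD": 0, "ZnF": 4},
    "NUP50": {"RanBD": 1, "ZnF": 0},
}
# Number of tested domains per protein that did not bind Ran by SEC-MALS
# (a single NUP358 ZnF, according to the text).
NON_BINDING = {"NUP358": 1, "NUP153": 0, "NUP50": 0}


def domains_tested():
    """Total number of RanBD and ZnF domains across the three nups."""
    return sum(sum(counts.values()) for counts in RAN_DOMAINS.values())


def sites_per_molecule(nup):
    """Ran-binding sites carried by one molecule of the given nup."""
    return sum(RAN_DOMAINS[nup].values()) - NON_BINDING[nup]


def max_ran_bound(copies_per_npc):
    """Upper bound on Ran molecules bound, given copies of each nup per NPC."""
    return sum(sites_per_molecule(nup) * n for nup, n in copies_per_npc.items())


total_sites = sum(sites_per_molecule(nup) for nup in RAN_DOMAINS)
# Only the NUP358 copy number (40 per NPC) is taken from this article;
# NUP153 and NUP50 are omitted, so this is a lower bound on the capacity.
nup358_only = max_ran_bound({"NUP358": 40})
print(domains_tested(), total_sites, nup358_only)  # prints: 17 16 440
```

With NUP358 alone contributing 11 sites per molecule, 40 copies already account for several hundred potential Ran-binding sites, in line with the estimate in the text.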


Docking of NUP358^NTD into the cytoplasmic
unassigned density cluster I


NUP358 is known to reside on the cytoplasmic face of the NPC. Without a
structure, its location in the NPC could previously be inferred only from the
differential absence of unassigned density in an ~38-Å cryo-ET map of a
NUP358-depleted human NPC (44). In the accompanying manuscript, we describe how
quantitative docking of residue-level resolution structures into the symmetric
core of a ~12-Å cryo-ET map of the intact human NPC led to assignment of 16
copies of the symmetric nups NUP205 and NUP93 in the cytoplasmic outer ring, as
well as the identification of two clusters (I and II) of unassigned density
(Fig. 4A), of which the first corresponds to the previously observed
NUP358-dependent density (42). Because of the large size and distinctive fold
of our newly elucidated NUP358^NTD crystal structure, we sought to directly
determine its position in the intact human NPC (Fig. 4A).

In our docking analysis, we calculated correlations between a new ~12-Å cryo-ET
map of the intact human NPC (provided by the Beck group) and 1 million
resolution-matched densities simulated from either the open or closed
conformation of the NUP358^NTD crystal structure, randomly placed and locally
fit-optimized in the asymmetric unit of the full ~12-Å cryo-ET map (46). Unlike
for the closed NUP358^NTD conformation, docking scores for five placements of
the open NUP358^NTD conformation segregated to high confidence of placement and
located to the previously unassigned density cluster I, leaving no unexplained
density (fig. S56). We found four copies of NUP358^NTD to be interfaced with the
α-helical solenoid folds of the CNC components NUP96, NUP107, and the distal
copy of NUP93^SOL, wrapping around the stalks of the tandem-arranged Y-shaped
CNCs in pairs, at equivalent distal and proximal positions (Fig. 4, B and C).
As identified in the docking analysis of the symmetric core reported in the
accompanying manuscript, the distal NUP93^SOL bisects the stalks of the
tandem-arranged Y-shaped CNCs by interfacing with the distal NUP107 and the
proximal NUP96 α-helical solenoids, cloistered between the four NUP358^NTD
copies (Fig. 4D) (42). Lastly, the fifth NUP358^NTD, referred to as the dome
copy, was docked above the other four NUP358^NTD copies and the distal
NUP93^SOL, with its N and C termini oriented toward the C termini of the outer
distal and inner proximal copies of NUP358^NTD, respectively (Fig. 4E). Though
unexpected, the placement of the dome NUP358^NTD was the second most confident
docking solution into both the current ~12-Å and previously reported ~23-Å
cryo-ET maps of the intact human NPC (44) (fig. S57). The placement of 40
molecules of NUP358 per NPC is in agreement with previous experimental
lower-bound stoichiometry estimates of 32 molecules of NUP358 (86). Finally, we
successfully placed the composite structure of the entire cytoplasmic
outer-ring protomer, including all five NUP358^NTD copies, into an anisotropic
~7-Å region of a composite single-particle cryo-EM map of the X. laevis NPC
cytoplasmic outer-ring protomer (figs. S58 and S59) (45).
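
The scoring idea behind such a docking search can be illustrated with a toy example. The sketch below is not the procedure of reference 46 and makes strong simplifying assumptions: the "map" is a synthetic 20 x 20 x 20 grid containing one noisy Gaussian blob, the "structure" is simulated as a Gaussian of matching width, only translations are sampled (no rotations and no local fit optimization), and the score is a plain Pearson correlation. All names and parameter values are hypothetical.

```python
import numpy as np


def gaussian_blob(shape, center, sigma):
    """Simulate a low-resolution density: one isotropic Gaussian on a 3D grid."""
    grid = np.indices(shape).astype(float)
    sq_dist = sum((g - c) ** 2 for g, c in zip(grid, center))
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def correlation(a, b):
    """Pearson correlation between two density grids of equal shape."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def random_placement_scores(target, sigma, n_trials, rng):
    """Score randomly placed simulated densities against the target map."""
    upper = np.array(target.shape) - 1
    centers = rng.uniform(0, upper, size=(n_trials, 3))
    scores = np.array(
        [correlation(target, gaussian_blob(target.shape, c, sigma)) for c in centers]
    )
    return centers, scores


rng = np.random.default_rng(0)
true_center = (12.0, 7.0, 9.0)  # synthetic ground truth, not experimental data
target = gaussian_blob((20, 20, 20), true_center, sigma=2.0)
target += 0.05 * rng.standard_normal(target.shape)  # synthetic map noise

centers, scores = random_placement_scores(target, sigma=2.0, n_trials=500, rng=rng)
best = centers[np.argmax(scores)]
# How far the top score stands out from the bulk of random placements.
z_top = (scores.max() - scores.mean()) / scores.std()
print("best placement:", best, "z-score of top hit:", z_top)
```

In this toy setting most random placements score near zero, and the few that land on the density stand apart from the bulk of the distribution, which is the sense in which docking scores are said to segregate.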

The arrangement of five NUP358^NTD copies in each spoke places their C termini
in proximity of each other, projecting the remaining domains toward the
cytoplasm. Consequently, the oligomerization domains of the five NUP358^NTD
copies are constrained to form a homomeric assembly within the same spoke
(Fig. 4, F and G). The oligomerization of NUP358 observed in the NUP358 crystal
structure and SEC-MALS analysis would boost the avidity of NUP358 attachment to
the cytoplasmic face of the NPC (Figs. 3, G and H, and 4G).


NUP358 is dispensable for NPC integrity 
during interphase 


Our quantitative docking showed that NUP358^NTD is the primary attachment point
for NUP358 at the cytoplasmic face of the NPC. To validate this result
physiologically, we sought to determine the subcellular localization of
structure-guided NUP358 fragments in intact cells. To prevent default
localization of ectopically expressed fragments at the nuclear envelope because
of homo-oligomerization with the NUP358^OE of endogenous proteins,
we generated an inducible NUP358-knockout 
cell line, in which an N-terminal auxin-inducible 
degron (AID) tag was inserted into both ge- 
nomic NUP358 loci (AID::NUP358 HCT116) 
(fig. S60). Addition of auxin resulted in the 
rapid, selective, and complete degradation of 
endogenous NUP358 within 3 hours, confirmed 
by the loss of immunofluorescent nuclear en- 
velope rim staining and Western blot analysis 
of cellular NUP358 protein levels (Fig. 5A and 
figs. S60 and S61). 

Movie 3. Comparison of NUP358 and NUP153 ZnF•Ran(GDP) complexes. The six
NUP358 ZnF•Ran(GDP) and four NUP153 ZnF•Ran(GDP) cocrystal structures are shown
individually, followed by a superposition. A 360° rotation of the NUP358 and
NUP153 ZnF superposition and Ran(GDP) is provided, with a zoom-in view showing
hydrophobic residues of the ZnF N-terminal extension buried in the Ran
hydrophobic pocket. Colors are as in Fig. 3K. Panel labels: NUP358 ZnF2, ZnF3,
ZnF4, ZnF5/6, ZnF7, and ZnF8•Ran(GDP); NUP153 ZnF1, ZnF2, ZnF3, and
ZnF4•Ran(GDP).

To identify the minimal NUP358 region necessary and sufficient for nuclear
envelope targeting, we generated a systematic series of hemagglutinin
(HA)-tagged N- and C-terminal fragments, splitting the protein into two pieces
after the NTD, OE, RanBD-I, or ZFD, and determined their subcellular
localization by immunofluorescence microscopy. NUP358 targeting to the nuclear
envelope in the absence of auxin required both NTD and OE regions, whereas all
other domains were dispensable (Fig. 5B and fig. S62). When NUP358^NTD or
NUP358^OE was tested in isolation, neither was found to be sufficient, with
both domains exhibiting strong nuclear staining (Fig. 5B). Introduction of the
NUP358 2R5K mutation, located on the NUP358^NTD concave surface in contact with
the CNC,
either abolished or severely reduced nuclear envelope rim staining when
introduced into HA-NUP358^NTD-OE or HA-NUP358, respectively (Fig. 5B).
Analogously, NUP358 oligomerization is required for localization to the nuclear
envelope, with introduction of the oligomerization-deficient LIQIML mutation
eliminating nuclear envelope rim staining of both HA-NUP358^NTD-OE and
HA-NUP358 (Fig. 5B). Notably, we repeated the fluorescence microscopy analysis
after auxin-induced depletion of endogenous NUP358 and obtained identical
results (fig. S63).

The previous ~38-Å cryo-ET map of the human NPC showed loss of the distal
cytoplasmic CNC ring and potentially other CF nups in the absence of NUP358,
leading to the conclusion that NUP358 is required for the integrity of the
interphase NPC (44). To determine whether the architectural stability of the
interphase NPC depends on NUP358, we sought to analyze the effect of
auxin-induced NUP358 depletion on the subcellular localization of eight nups
representative of all NPC subcomplexes by immunofluorescence microscopy
(Fig. 5A). Because previous studies showed that NUP358 depletion results in
cell-cycle arrest at the G2-to-M phase transition (87), we first monitored the
levels of cell-cycle markers to determine a cell-cycle length of ~14 hours,
consistent with previous reports for HCT cells (fig. S64) (88). We then induced
NUP358 degradation in nocodazole-synchronized cells and imaged nups at various
time points before cells entered mitosis and at 24 hours. Although NUP358
nuclear envelope rim staining was rapidly lost 2 hours after induction of
degradation and remained absent throughout the remaining time points, all eight
representative nups continued to display robust nuclear envelope rim staining
(Fig. 5A). This suggests that NPC integrity is not dependent on NUP358
attachment to the NPC and also demonstrates the specificity of the
auxin-induced NUP358 knockout. To reconcile the apparent conflict between our
results and the aforementioned cryo-ET study, we investigated whether NUP358
depletion led to release of nups from the nuclear fraction during cellular
fractionation. Indeed, we observed auxin-dependent leakage of NUP214, NUP88,
and NUP160 from the nuclear to the cytoplasmic fraction (fig. S61C). Curiously,
we also consistently observed a reduction of the nuclear basket nup ELYS in the
nuclear fraction upon NUP358 depletion.

Together, these data confirm the quantitative docking of NUP358^NTD, validate
the physiological relevance of NUP358^OE-mediated bundling, and establish that
NUP358 is dispensable for the architectural integrity of the assembled
interphase NPC, although its depletion made the structural integrity of the
cytoplasmic face of the NPC susceptible to the biochemical stresses inherent to
cell fractionation. Future studies are needed to establish the extent of
NUP358’s role in the formation of the double-CNC ring architecture during NPC
assembly.

Movie 4. Comparison of NUP358 and NUP50 RanBD•Ran(GMPPNP) complexes. A 360°
rotation of the NUP358^RanBD•Ran(GMPPNP) cocrystal structure, colored as in
Fig. 3L. A zoom-in view is provided, transitioning between the four different
NUP358 RanBD structures and the single NUP50^RanBD structure, showing
hydrophobic residues of the NUP358 RanBD N-terminal extension buried in the Ran
hydrophobic pocket. Finally, the interaction of individual RanBD basic patches
with the Ran acidic tail is shown, colored as in fig. S55D.


NUP358 plays a general role in translation of 
exported mRNAs 


Export of mRNA from the nucleus to the cytoplasm is an essential step in the
expression of eukaryotic proteins (Fig. 5C) (43). Our biochemical analysis
revealed that NUP358 has multiple RNA-binding domains distributed throughout
the protein, suggesting a potential role in RNA export and mRNP remodeling
(Fig. 2G). The docking of five NUP358^NTD copies in the intact human NPC
revealed occlusion of the RNA/NUP88^NTD-binding surfaces of the dome and inner
distal NUP358 copies by the CNC stalk but exposure of the remaining copies’
binding sites (fig. S65). Thus, some NUP358^NTD copies could potentially be
simultaneously attached to the NPC and dynamically interact with RNA/NUP88^NTD.

Previous studies had found that efficient 
translation of secretory proteins requires 
NUP358 binding to the ~63-nucleotide GC- 
rich signal sequence coding region (SSCR) of 
mRNAs encoding secretory proteins (89). 
NUP358 knockdown by short hairpin RNAs 
was shown to prevent the translation of var- 
ious secretory protein reporters but had no 
effect on the distribution of mRNA in the cell 
(89). These experiments involved extended incubation periods and achieved only
a partial NUP358 knockdown, potentially allowing secondary phenotypes to emerge
from prolonged NPC disruption, non-NPC-related NUP358 effects, or defective
postmitotic NPC reassembly.

Fig. 4. Docking of NUP358^NTD on the cytoplasmic face of the NPC. (A) Overview
of the NPC cytoplasmic face with isosurface rendering of unexplained density
clusters I (red) and II (cyan) of the ~12-Å cryo-ET map of the intact human
NPC. The inset indicates the location of the magnified view showing cartoon
representations of five copies of NUP358^NTD docked in unassigned density
cluster I. (B) Comparison of the binding of two NUP358^NTD copies (cartoon
representation) to distal and proximal CNCs (surface representation). (C to E)
Architecture of the pentameric NUP358^NTD bundle attachment site on a
cytoplasmic outer-ring spoke with cartoon- and surface-represented structures
(left) and schematics (right), sequentially illustrating the placement of (C)
four copies of NUP358^NTD around NUP96 and NUP107 interfaces on the stalks of
tandem-arranged Y-shaped CNCs; (D) a distal copy of NUP93^SOL collocated at the
center of the NUP358^NTD bundle, interfacing with both proximal and distal CNC
stalks; and (E) the NUP358^NTD dome copy interfacing with the stalk-attached
NUP358^NTD quartet beneath. (F) Overview of the entire cytoplasmic face of the
NPC in cartoon representation and as a schematic, illustrating the distribution
of 40 NUP358^NTD copies anchored as pentameric bundles across the eight NPC
spokes. (G) Schematic of NUP358 attachment to the cytoplasmic outer-ring spoke.
The NUP358 pentameric bundle is linked together by interactions between OEs.
Anchored by NUP358^NTD, the rest of the NUP358 domains are linked by
unstructured linker sequences and are expected to freely project from the
cytoplasmic face of the NPC. Distal and proximal positions are labeled
according to the legend in (A).

With the ability to rapidly deplete NUP358
in our AID cell line, we sought to examine whether NUP358 is directly involved
in mRNA export or mRNP remodeling by monitoring the subcellular distribution of
5-ethynyl uridine (5-EU) pulse-labeled, newly synthesized RNA after NUP358
depletion. In situ-labeled RNA was visualized by fluorescence microscopy at
hourly intervals during the chase for up to 6 hours (Fig. 5D). At early time
points, 90 to 100% of cells displayed a strong nuclear 5-EU-labeled RNA signal
that decreased over time with a concomitant increase in the cytoplasmic signal,
indicative of RNA being exported. After 6 hours, only ~5% of NUP358-depleted
cells exhibited nuclear retention of labeled RNA, compared with <1% of control
cells (Fig. 5D). Because of this subtle effect, we sought to confirm the result
in AID::NUP358 DLD1 cells (Fig. 5D and figs. S66 and S67). Similar to the
AID::NUP358 HCT116 cell results, only ~3% of NUP358-depleted DLD1 cells
exhibited nuclear RNA retention after a 6-hour chase (Fig. 5D and fig. S68). By
contrast, nuclear RNA retention was present in ~20% of NUP160::NG-AID DLD1
cells after depletion of NUP160, the knockout of which causes RNA retention in
S. cerevisiae, demonstrating that our experimental approach is suitable in
principle (Fig. 5D and fig. S68) (90).

point) with nuclear RNA retention are shown with
triplicate experiments. Quan-
reporter expression in auxin-treated cells was normalized to expression in control cells at the 10-hour time point. Experiments were performed in triplicate, with
experiment.

Next, we analyzed the fate of the genetic 
message downstream of mRNA export by 
examining the dependence of cellular protein 
expression on NUP358 in AID::NUP358 HCT116 
cells using eight different reporter constructs. 
Synchronized cells were transfected with 
C-terminally FLAG-tagged reporter constructs 
before NUP358 depletion, and the amount 
of reporter in whole-cell extracts was deter- 
mined by Western blot analysis (Fig. 5E and 
fig. S69). First, we focused our analysis on 
representative secretory protein reporter con- 
structs including insulin, interleukin-10 (IL-10), 
IL-6, tumor necrosis factor-α (TNFα), and
membrane-bound placental alkaline phospha- 
tase (ALPP). NUP358 depletion significantly 
reduced the cellular levels of all five reporters. 
We also tested nonsecretory protein reporters 
and, contrary to the previous observation of 
secretory protein-effect specificity, found that 
NUP358 depletion also significantly reduced 
the cellular levels of ribosomal protein L26 
(RPL26), green fluorescent protein (GFP),
and histone 1B (H1B) reporters (Fig. 5E and 
fig. S69). 

In summary, our data confirm that NUP358 
depletion does not result in marked nucle- 
ar RNA accumulation, but it nevertheless 
affects the efficient translation of secreted 
and membrane-bound proteins, as previously 
proposed (89). However, our findings also 
demonstrate that the observed translational 
defect is not restricted to secretory proteins, 
which suggests a more general role of NUP358
in mRNP-remodeling events that occur at
the cytoplasmic face of the NPC after mRNA 
export. 


Characterization of NUP358 harboring 
ANE1 mutations 


Acute necrotizing encephalopathy (ANE) is an autoimmune disease in which
previously healthy children experience a cytokine storm after common viral
infections, resulting in brain inflammation and rapid deterioration from
seizures to coma that can ultimately be fatal (91). ANE1, the familial and
recurring form of ANE, has been associated with four distinct NUP358 mutations:
T585M, T653I, I656V, and W681C (91, 92). All four ANE1 mutations map to the
C-terminal α-helical solenoid of NUP358^NTD (fig. S70). Apart from T585, which
is exposed on the surface, the ANE1 mutations are located in the closely packed
hydrophobic core (fig. S70C). We determined cocrystal structures of NUP358^NTD
harboring the individual ANE1 mutations T585M, T653I, and I656V in complex with
sAB-14, revealing no substantial structural changes, with a root mean square
deviation calculated over 746 Cα atoms of ~0.5 Å (fig. S70 and table S10).
Moreover, we did not detect differences in nuclear envelope rim staining or
binding to NUP88^NTD or ALPP SSCR RNA (fig. S71). Notably, we found that in
vitro thermosolubility of the W681C, T653I, and I656V mutants was reduced at
temperatures below body temperature (fig. S72) but increased beyond wild-type
levels by binding to sAB-14 in all three mutants (fig. S73).
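
For readers unfamiliar with the metric, a root mean square deviation over matched Cα atoms is typically computed after optimal rigid-body superposition. The sketch below uses the standard Kabsch algorithm as one common way to do this; the text does not state which software or algorithm was used, so this is an assumption. The coordinates are synthetic random points (746 of them, only to mirror the atom count quoted above), not the NUP358 structures, and the function name is hypothetical.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two matched (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)  # remove translation by centering both sets
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc            # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # rotation mapping Pc onto Qc
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


rng = np.random.default_rng(1)
coords = 20.0 * rng.standard_normal((746, 3))  # synthetic stand-in for Cα atoms

# Rigidly rotate (about z) and translate the same points: RMSD should be ~0.
angle = 0.7
Rz = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
moved = coords @ Rz.T + np.array([3.0, -1.0, 2.0])
# Add synthetic per-atom displacements to mimic small structural differences.
noisy = moved + 0.3 * rng.standard_normal(coords.shape)

print(kabsch_rmsd(coords, moved), kabsch_rmsd(coords, noisy))
```

A pure rigid-body motion yields an RMSD of essentially zero, whereas small per-atom displacements give a value of a fraction of an ångström, the regime usually read as "no substantial structural change".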

Together, our results indicate that ANE1 
mutations neither directly disturb the fold 
observed in the crystal structure nor affect 
the known cellular functions of NUP358. 
Our observation of a substantially reduced 
thermosolubility of NUP358 ANE1 mutants is 
notable, considering that the sudden onset of 
symptoms appears to require a fever-inducing 
trigger such as a viral infection (91). Future 
studies will be needed to systematically assess 
whether ANE1 mutations affect unknown cel- 
lular functions of NUP358. 


Structural and biochemical analysis of
NUP88^NTD•NUP98^APD•NUP214^TAIL


Besides NUP358, the NUP88^NTD•NUP98^APD•NUP214^TAIL complex was, until now,
another CF component for which atomic-level structural information had remained
unavailable. Through extensive screening of crystallization fragments and
conditions, we solved the structure of the heterodimeric NUP88^NTD•NUP98^APD at
2.0-Å resolution (Fig. 6H and table S17). Despite low sequence homology, the
overall architecture of the NUP88^NTD•NUP98^APD complex is conserved from fungi
to humans, although the orientation of NUP98^APD relative to NUP88^NTD varies
between the cocrystal structures of human and fungal orthologs by as much as
~20° (figs. S74 to S77 and Movie 5) (11, 59). For a detailed description of the
structure, see the supplementary text (figs. S76 to S81).

Because the NUP214^TAIL•NUP88^NTD interaction was crystallographically
intractable, we mapped a minimal NUP88^NTD-binding region spanning NUP214
residues 938 to 955 by systematic truncation (figs. S78 and S79). NUP214^TAIL
forms a hydrophobic interaction with NUP88^NTD at the 6CD insertion, which was
abolished by a combined NUP88^NTD LLL mutation, analogous to a mutation we had
previously shown to abolish the interaction between the S. cerevisiae orthologs
Nup159^TAIL and Nup82^NTD (fig. S79) (59). Notably, this NUP88^NTD LLL mutation
straddles a naturally occurring D434Y mutation in NUP88 that is linked to a
fatal disorder called fetal akinesia deformation sequence, which is associated
with congenital malformations and impaired fetal movement (fig. S80) (93).
Given its location, the D434Y mutation is expected to interfere with the
NUP214^TAIL interaction.


Combined, our structural and biochemical 
analysis of NUP88, NUP214, NUP98, and their 
interactions shows that their shape, mode of 
interaction, and the overall architecture of 
their complexes are evolutionarily conserved 
from fungi to humans, despite primary se- 
quence divergence. 


Docking of the CFNC into unassigned cluster II 


After the placement of five NUP358^NTD copies into unassigned density cluster
I, we reasoned that the remaining unassigned density cluster II would represent
the CFNC. Unassigned density cluster II is composed of two near-perpendicular
tube-like segments that bisect the NUP75 arms of the distal and proximal
Y-shaped CNCs, a globular segment lodged between the base of the long tube-like
segment and the proximal NUP75 arm, and a dumbbell-shaped globular density that
projects toward the central transport channel (Fig. 6A). Owing to their small
size and lack of distinctive shape features, the quantitative docking of
NUP88^NTD•NUP98^APD, NUP214^NTD•DDX19, GLE1^CTD•NUP42^GBM,
GLE1^CTD•NUP42^GBM•DDX19, RAE1•NUP98^GLEBS, NUP358^RanBD•Ran, NUP358^ZnF•Ran,
and NUP358^CTD into the ~12-Å cryo-ET map of the intact human NPC, from which
all of the previously explained density had been subtracted, did not result in
high-confidence solutions (fig. S82). We therefore took the less objective
approach of manual placement based on shape complementarity and biochemical
restraints, followed by local rigid-body refinement.

We used the C. thermophilum and X. laevis CNT crystal structures, the latter
containing NUP62, as a template for the polyalanine model of the coiled-coil
segments (CCSs) 1 and 2 of the CFNC hub (10, 17). Notably, the CCS1 and CCS2
models based on CNT structures matched the shape and dimensions of two
near-perpendicular segments of tube-like density, which suggests that the
CFNC-hub coiled-coil architecture is similar to that of the CNT (Fig. 6, B and
C, and fig. S83A). NUP88^NTD•NUP98^APD fit best at the base of the long CCS1
segment, interfacing with the NUP75 arm of the proximal CNC. The tentative
placement would be consistent with the biochemically mapped interaction between
NUP88^NTD•NUP98^APD and the NUP214^TAIL segment expected to emanate from the
C-terminal base of the CCS3 segment and thus restrain NUP88^NTD•NUP98^APD near
the CFNC-hub base (Fig. 6, B to E). A dumbbell-shaped density interfacing with
NUP88^NTD•NUP98^APD and extending toward the central transport channel is
consistent with the shape and size of the NUP214^NTD•DDX19 crystal structure,
although it could also be explained by other more transiently tethered
components of the nucleocytoplasmic transport machinery or cargo (Fig. 6, B and
C, and fig. S83A). However, the NUP214^NTD•DDX19 complex forms tighter
interactions when DDX19 is in its ADP-bound state and would therefore be
expected to exist in the adenosine triphosphate (ATP)-depleted environment of
the purified nuclear envelope (50, 63). The placements of the CFNC-hub model
and NUP88^NTD•NUP98^APD were further supported by manual docking into an
anisotropic ~8-Å region of a composite single-particle cryo-EM map of the
X. laevis cytoplasmic outer-ring protomer, although the map masking excluded
the region that includes the dumbbell-shaped density (fig. S83B) (45).

Fig. 6. Docking analysis reveals that NUP93 anchors the CFNC to the cytoplasmic
outer ring. (A) Overview of the NPC cytoplasmic face with isosurface rendering
of unexplained density cluster II (cyan) of the ~12-Å cryo-ET map of the intact
human NPC. The inset indicates the location of the magnified view in (B).
(B and C) Two views of manually placed poly-alanine models of CFNC-hub segments
CCS1 and CCS2,
white (less than 55% similarity), to yellow (55% similarity), to red (100%
identity), using the BLOSUM62 weighting algorithm. (Right) Domain architectures
of NUP214, NUP88, NUP62, and NUP93. The location of the NUP93 LIL mutation is
indicated (red dots). Single-letter abbreviations for the amino acid residues
are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile;
K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val;
W, Trp; and Y, Tyr. (H) Two views of the

Movie 5. Evolutionary conservation of the NUP88^NTD•NUP98^APD architecture. A
360° rotation of the NUP88^NTD•NUP98^APD cocrystal structure and the previously
determined crystal structures of S. cerevisiae Nup82^NTD•Nup116^CTD•Nup159^TAIL
(PDB ID 3PBP) (59) and C. thermophilum Nup82^NTD•Nup145N^APD•Nup159^TAIL
(PDB ID 5CWW) (11), colored as in fig. S77.


Placement of the CFNC hub into the tube-like density puts CCS1, CCS2, and
likely the unresolved CCS3 within reach of the ~40-Å root mean square length of
the linker that tethers NUP93^R1 to the NUP205-bound NUP93^SOL (Fig. 6D and
fig. S84). In the accompanying paper, we demonstrated that the NUP93^R1
fragment (residues 2 to 93), like the orthologous C. thermophilum Nic96^R1
assembly sensor, binds to the CNT complex of the inner ring (42). The proximity
of the CFNC-hub coiled-coil segments to the expected NUP93^R1 location
suggested that NUP93^R1 might act as an assembly sensor for the CFNC hub as
well. Indeed, the NUP93^R1 fragment formed stable complexes with the intact
CFNC and CFNC hub (Fig. 6F and fig. S85). Notably, the NUP93^R1 LIL mutation
that abolished CNT binding (42) also abolished the interaction with the CFNC
hub (Fig. 6, F and G). To ensure that we did not miss an interaction of the
C. thermophilum CFNC (ctCFNC), we evaluated whether Nic96^R1 could bind the
ctCFNC hub. In fact, Nic96^R1 did not bind to the ctCFNC hub, consistent with
our reconstitution results that identified two distinct assembly sensors for
the ctCFNC in the CNC (fig. S86). These data indicate that the long-elusive
assembly sensor anchoring point of the human CFNC is not provided by the
Y-shaped CNC, but rather by the NUP205-positioned NUP93^R1, corroborated by the
recent finding that NUP93 depletion displaces the CFNC nups NUP214, NUP88, and
NUP62 from the nuclear envelope (94).

A second NUP93^R1 assembly sensor emanating from the proximal NUP205-positioned
NUP93^SOL represents a potential anchoring site for a second, flexibly attached
proximal CFNC (fig. S84). The placement of 16 copies of the CFNC on the
cytoplasmic face of the NPC, half of which are unresolved in the ~12-Å cryo-ET
map, is consistent with the previously established stoichiometry (86). Recent
in situ ~37- and ~34-Å cryo-ET maps of the dilated human NPC (95, 96) present
unexplained elongated density near the expected location of the proximal
NUP93^R1 but could not be further interpreted at the current resolutions
(fig. S87).

Finally, we tentatively placed the human CF nup GLE1^CTD•NUP42^GBM crystal
structure into a region of unexplained density in the ~12-Å cryo-ET map of the
intact human NPC between the cytoplasmic bridge NUP155 and the cytoplasmic face
of the nuclear envelope, consistent with our previous analysis (fig. S88) (63).


Steric occlusion is insufficient to explain 
asymmetric decoration of the NPC 


Having assigned all cytoplasmic density of clusters I and II to NUP358
pentameric bundles and CFNCs, respectively, we next evaluated whether any
structural features prevent NUP358 or CFNC mislocalization at the nuclear face
of the NPC. We found that unexplained nuclear density adjacent to the NUP160
arms of the Y-shaped CNCs could be assigned to 16 copies of the structured
N-terminal domains of the nuclear basket nup ELYS (fig. S89) (25). The ELYS
domains did not overlap with nuclear regions equivalent to the sites occupied
by NUP358 and CFNC on the cytoplasmic face, thereby excluding the possibility
that steric competition with NUP358 or CFNC prevents ELYS mislocalization
(Fig. 7, A and B). On the contrary, the ~12-Å cryo-ET map revealed rod-shaped
unassigned densities atop the nuclear outer ring in regions equivalent to
NUP358 sites on the cytoplasmic face (Fig. 7C). Analogously, we examined
whether recruitment of the CFNC to the nuclear face was prevented by steric
hindrance from a nuclear basket component. Although the NUP205•NUP93^SOL
attachment site from which NUP93^R1 is flexibly projected remains unencumbered,
an unassigned rod-shaped cryo-ET density present on the nuclear face overlaps
with an area equivalent to the CFNC-hub docking sites on the cytoplasmic face
(Fig. 7C). Together, these findings suggest that mechanisms other than steric
competition alone, such as active nuclear transport of asymmetric nups, as
previously indicated for NUP214 and NUP153 (81, 97), are key determinants of
the asymmetric localization of NUP358, CFNC, and ELYS.

Together, our data complete the near-atomic 
composite structure of the symmetric and cyto- 
plasmic asymmetric portions of the human 
NPC (Fig. 8 and Movie 6). 


Conclusions 


Situated on the cytoplasmic face of the NPC, 
CF nups remodel mRNPs as they emerge from 
the central transport channel, ensuring direc- 
tional transport of mRNA and preparing it for 
downstream translation. Given this essential 
cellular function, it is unsurprising that CF 
nups are a hotspot for mutations associated 
with currently incurable diseases, ranging 
from neurodegenerative and autoimmune 
disorders to aggressive cancers. Through a 
comprehensive analysis combining in vitro 
complex reconstitution, crystal structure de- 
termination, quantitative docking, and in vivo 
validation, we established a near-atomic com- 
posite structure of the cytoplasmic face of the 
human NPC. 

Our biochemical reconstitution highlights 
the evolutionary conservation of the CFNC 
modular assembly, which consists of a central 
heterotrimeric coiled-coil hub that tethers two 
separate mRNP-remodeling complexes together. 
Despite the divergence in attachment mecha- 
nisms, the anchoring of two copies of the 
CFNC module to each of the eight NPC spokes 
appears to be an evolutionarily conserved

Fig. 7. Comparison of cytoplasmic and nuclear faces of the human NPC. (A and B) Overall top
view (left); single-spoke protomer with symmetric core nups in surface, docked asymmetric nups
in cartoon, and unexplained density of the ~12-Å cryo-ET map in isosurface representation
(middle); and schematic (right) of the (A) cytoplasmic and (B) nuclear face of the intact human
NPC. (C) Superpositions of the overall view (left) and two orthogonal views of single-spoke
protomers (middle and right) of the nuclear and cytoplasmic faces, with hypothetical steric
clashes between the CFs in cartoon representation, and the unassigned asymmetric nuclear
density (cyan) indicated. Distal and proximal positions are labeled according to the legend.
Inset boxes indicate regions of magnified protomer views (right).

architectural outcome: The C. thermophilum 
NPC presents two distinct assembly sensor 
motifs for the CFNC hub in the Nup37 and 
Nup145C subunits of each CNC. The human 
NPC reuses the NUP93 sensor for the assembly 
and anchoring of the CNT in the inner ring as 
an anchor for the CFNC in the cytoplasmic outer 
ring by intercalating two NUP205•NUP93
copies among the tandem-arranged CNCs
of each spoke. Previous studies have also 
shown that the S. cerevisiae NPC incorporates 
a P-shaped CFNC dimer (6/) to a single site 
within each of the eight outer-ring spokes 
(40, 41). 

In addition to the CFNC, the asymmetric CF 
nup decoration of the human NPC cytoplasmic 
face includes NUP358. Conjoined by an oligomerization
element, pentameric bundles of the
distinctively folded NUP358^NTD envelop the
tandem-arrayed stalks of a CNC pair in each of
the eight spokes. Each attached NUP358^NTD
anchors an extensive ~2400-residue C-terminal
region that harbors 14 different domains con- 
nected by unstructured linkers, thereby extend- 
ing as much as ~60 nm from the outer ring 
(98). Our placement of the CNC and CF nups 
explains the entirety of the observed cytoplasmic

Fig. 8. Architecture of the human NPC cytoplasmic face. Near-atomic composite structure of the
human NPC generated by docking individual nup and nup complex crystal and cryo-EM structures
into a ~12-Å cryo-ET map of the intact human NPC, viewed from (A) the cytoplasmic face and
(B) the central transport channel as a cross section. Newly placed structures include the
quantitatively docked NUP358^NTD and the manually docked NUP88^NTD•NUP98^APD,
NUP214^NTD•DDX19, GLE1^CTD•NUP42^GBM, ELYS^NTD, and a CFNC-hub model. The nuclear
envelope is rendered as a gray isosurface. Nups are shown in cartoon representation and
colored according to the legend.

face cryo-ET density, accounting for ~23 MDa 
of structured mass. The more flexibly attached 
regions of the CF nups that are not captured by 
the current subtomogram-averaged cryo-ET 
map account for an additional ~19 MDa of mass. 

In addition to these flexibly attached struc- 
tured domains, NUP358, NUP214, NUP98, and 
NUP42 contain extended FG-repeat regions 
emanating from various anchor points at the
outer ring. The degree to which these regions
contribute to the architectural integrity of 
the human NPC, as has been shown for the 
S. cerevisiae NPC (99), and the NPC’s diffusion 
barrier remain important questions for future 
research. 

We found that the interactions between the 
C. thermophilum CF nups and the CNC are 
modulated by the small molecule IP6, the
presence of which is required for mRNA ex- 
port. Future studies are needed to address 
the concerted role of posttranslational mod- 
ifications, second messengers and other small 
molecules, and macromolecular factors in 
regulating the assembly and functions of the 
NPC cytoplasmic face in mRNA export. 

The integral membrane proteins and nuclear
basket portions of the NPC represent an

Movie 6. Overview of the composite structure of the human NPC cytoplasmic face. The structures are
shown docked into the cryo-ET reconstruction of the intact human NPC, colored according to Fig. 8. A
pentameric bundle of NUP358^NTD is docked, followed by relative placement of NUP358^OE and additional
NUP358 domains, followed by the CFNC components. An overview of a single-spoke cytoplasmic face
protomer is shown, followed by a comparison of cytoplasmic and nuclear face protomers.

outstanding challenge for structural determi- 
nation. Nevertheless, our analysis has already 
identified that competition for binding sites 
could play a role in the segregation of CF and 
nuclear basket nups to opposite faces of the 
NPC. However, steric occlusion alone is in- 
sufficient to deterministically establish NPC 
polarity, whereby the correct asymmetric nups 
are segregated to either the cytoplasmic or 
the nuclear face, or the proximal NUP93 and 
NUP205 copies are excluded from the nuclear 
outer ring. Nuclear and cytoplasmic eviction 
mediated by the nucleocytoplasmic transport 
machinery is perhaps the most obvious candi- 
date for a mechanism that maintains the polar 
subcellular segregation of asymmetric nups. 
The data presented here provide a compre- 
hensive biochemical foundation and a struc- 
tural framework for the design of future 
experiments aimed at elucidating the multiple 
mechanistic steps involved in mRNP export 
and remodeling. This mechanistic insight will 
be vital for illuminating disease mechanisms 
associated with CF nup genetic variants and 
mechanisms by which viral virulence factors,
e.g., SARS-CoV-2 (severe acute respiratory syndrome
coronavirus 2) ORF6, hijack the functions
of the NPC (100).

Our results represent a substantial step 
toward complete in vitro reconstitution of 
the NPC and establish a near-atomic compos- 
ite structure of the entire cytoplasmic face of 
the human NPC. More broadly, they illustrate 
the effectiveness of our divide-and-conquer 
approach in successfully elucidating the near- 
atomic architecture of an assembly as large 
and complex as the NPC, serving as a para- 
digm for studying similar macromolecular 
machines, which remains a major frontier in 
structural cell biology. 


Materials and methods summary 


Full details of the materials and methods are 
presented in the supplementary materials. 
Briefly, the sources of materials and reagents 
are summarized in table S1. Bacterial, insect, 
and mammalian cell expression constructs 
and conditions are described in tables S2 to S4. 
Proteins were purified using standard chromatography
techniques, and purification procedures
are summarized in table S5. Purified
proteins and complex formation were charac- 
terized by analytical SEC-MALS, summarized 
in table S6. LLPS of purified protein mixtures 
was analyzed by centrifugal separation followed 
by SDS-polyacrylamide gel electrophoresis 
(SDS-PAGE) and Coomassie staining, and by 
fluorescence microscopy after N-terminal 
amino labeling with fluorescent dyes. Nup- 
RNA binding interactions were assayed by 
EMSAs employing either ³²P-labeled or unlabeled
nucleic acid probes, visualized by
autoradiography or SYBR Gold staining, respectively.
Structures were determined by x-ray
crystallography, with crystallization condi- 
tions and x-ray diffraction data collection, 
processing, and refinement statistics sum- 
marized in tables S7 to S17. Quantitative dock- 
ing was performed by randomly placing and 
scoring densities simulated from crystal structures
into ~12- and ~23-Å cryo-ET maps of the
intact human NPC (44, 46). Experimental 
structures used to generate the near-atomic 
composite structure of the intact human 
NPC are inventoried in table S18. NUP358
localization, NPC integrity, RNA export, and 
reporter expression levels were assessed in 
auxin-inducible degron cell lines. 
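
As an illustration only, the idea of docking by random placement and scoring can be sketched in Python with NumPy. This is a toy version under stated assumptions, not the procedure detailed in the supplementary materials: the function names are hypothetical, rotations are restricted to 90° steps to avoid interpolation, and the score is a plain normalized cross-correlation between the simulated probe density and the map window it overlaps.

```python
import numpy as np


def normalized_cross_correlation(a, b):
    """Pearson correlation between two equally shaped density arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def random_rotation_90(probe, rng):
    """Random rotation in 90-degree steps (toy stand-in for a full rotational search)."""
    planes = [(0, 1), (0, 2), (1, 2)]
    for _ in range(3):
        plane = planes[int(rng.integers(3))]
        probe = np.rot90(probe, k=int(rng.integers(4)), axes=plane)
    return probe


def random_docking(target, probe, n_trials=2000, seed=0):
    """Place the probe at random poses and return (best_origin, best_score, z_score)."""
    rng = np.random.default_rng(seed)
    origins, scores = [], []
    for _ in range(n_trials):
        rotated = random_rotation_90(probe, rng)
        # Highest allowed origin along each axis so the probe stays inside the map.
        limits = np.array(target.shape) - np.array(rotated.shape)
        origin = tuple(int(rng.integers(0, lim + 1)) for lim in limits)
        window = target[tuple(slice(o, o + s) for o, s in zip(origin, rotated.shape))]
        origins.append(origin)
        scores.append(normalized_cross_correlation(window, rotated))
    scores = np.array(scores)
    best = int(scores.argmax())
    # Express the top score relative to the distribution of all random placements.
    spread = scores.std()
    z_score = float((scores[best] - scores.mean()) / spread) if spread > 0 else 0.0
    return origins[best], float(scores[best]), z_score
```

Reporting the top score relative to the spread of all sampled placements is one simple way to judge whether a fit stands out from the background of random poses; a real search would sample rotations finely and refine the best hits locally.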
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Structure of cytoplasmic ring of nuclear pore 
complex by integrative cryo-EM and AlphaFold 


Pietro Fontana†, Ying Dong†, Xiong Pi†, Alexander B. Tong†, Corey W. Hecksel, Longfei Wang,
Tian-Min Fu, Carlos Bustamante, Hao Wu*


INTRODUCTION: The nuclear pore complex 
(NPC) is the molecular conduit in the nu- 
clear membrane of eukaryotic cells that reg- 
ulates import and export of biomolecules 
between the nucleus and the cytosol, with 
vertebrate NPCs ~110 to 125 MDa in molec- 
ular mass and ~120 nm in diameter. NPCs 
are organized into four main rings: the cyto- 
plasmic ring (CR) at the cytosolic side, the 
inner ring and the luminal ring on the plane 
of the nuclear membrane, and the nuclear
ring facing the nucleus. Each ring possesses
an approximate eightfold symmetry and is 
composed of multiple copies of different nu- 
cleoporins. NPCs have been implicated in 
numerous biological processes, and their dys- 
functions are associated with a growing num- 
ber of serious human diseases. However, despite 
pioneering studies from many groups over 
the past two decades, we still lack a full un- 
derstanding of NPCs’ organization, dynam- 
ics, and complexity. 


Cryo-EM structure of the cytoplasmic ring of the nuclear pore complex from X. laevis. The 6.9 Å map was
generated with single-particle cryo-EM, and the model was built with AlphaFold structure prediction. The
secondary structural elements guided EM map fitting, resulting in an almost complete model of the complex. The
approach allowed the identification of five copies of Nup358 and a second copy of the trimeric
Nup214-Nup88-Nup62 complex.


Fontana et al., Science 376, 1178 (2022)


RATIONALE: We used the Xenopus laevis oocyte
as a model system for the structural charac- 
terization because each oocyte possesses a 
large number of NPC particles that can be 
visualized on native nuclear membranes with- 
out the aid of detergent extraction. We used 
single-particle cryo-electron microscopy (cryo- 
EM) analysis on data collected at different stage 
tilt angles for three-dimensional reconstruc- 
tion and structure prediction with AlphaFold 
for model building. 


RESULTS: We reconstructed the CR map of 
X. laevis NPC at 6.9 and 6.7 Å resolutions
for the full CR protomer and a core region, 
respectively, and predicted the structures of 
the individual nucleoporins using AlphaFold 
because no high-resolution models of X. laevis 
Nups were available. For any ambiguous sub- 
unit interactions, we also predicted complex 
structures, which further guided model fitting 
of the CR protomer. We placed the nucleoporin 
or complex structures into the CR density to 
obtain an almost full CR atomic model, com- 
posed of the inner and outer Y-complexes, two 
copies of Nup205, two copies of the Nup214- 
Nup88-Nup62 complex, one Nup155, and five 
copies of Nup358. In particular, we predicted 
the largest protein in the NPC, Nup358, as 
having an S-shaped globular domain, a coiled- 
coil domain, and a largely disordered C-terminal 
region containing phenylalanine-glycine (FG) 
repeats previously shown to form a gel-like con- 
densate phase for selective cargo passage. Four 
of the Nup358 copies clamp around the inner 
and outer Y-complexes to stabilize the CR, and 
the fifth Nup358 sits in the center of the
cluster of clamps. AlphaFold also predicted a
homo-oligomeric, likely specifically pentame- 
ric, coiled-coil structure of Nup358 that may 
provide the avidity for Nup358 recruitment to 
the NPC and for lowering the threshold for 
Nup358 condensation in NPC biogenesis. 


CONCLUSION: Our studies offer an example of 
integrative cryo-EM and structure prediction 
as a general approach for attaining more pre- 
cise models of megadalton protein complexes 
from medium-resolution density maps. The 
more accurate and almost complete model 
of the CR presented here expands our under- 
standing of the molecular interactions in the 
NPC and represents a substantial step forward 
toward the molecular architecture of a full 
NPC, with implications for NPC function, bio- 
genesis, and regulation. 


The list of author affiliations is available in the full article online. 
*Corresponding author. Email: wu@crystal.harvard.edu


Structure of cytoplasmic ring of nuclear pore
complex by integrative cryo-EM and AlphaFold


Pietro Fontana†, Ying Dong†, Xiong Pi†, Alexander B. Tong†, Corey W. Hecksel,
Longfei Wang, Tian-Min Fu, Carlos Bustamante, Hao Wu*


The nuclear pore complex (NPC) is the conduit for bidirectional cargo traffic between the cytoplasm
and the nucleus. We determined a near-complete structure of the cytoplasmic ring of the NPC from
Xenopus oocytes using single-particle cryo-electron microscopy and AlphaFold prediction. Structures of
nucleoporins were predicted with AlphaFold and fit into the medium-resolution map by using the
prominent secondary structural density as a guide. Certain molecular interactions were further built or
confirmed by complex prediction by using AlphaFold. We identified the binding modes of five copies
of Nup358, the largest NPC subunit with Phe-Gly repeats for cargo transport, and predicted it to
contain a coiled-coil domain that may provide avidity to assist its role as a nucleation center for NPC
formation under certain conditions.


The nuclear pore complex (NPC) regulates
nucleocytoplasmic passage of biomolecules
and has been implicated in numerous
biological processes, with NPC
dysfunctions associated with a growing
number of diseases (1-6). An NPC is composed
of multiple copies of more than 30 nucleoporins


Fig. 1. Cryo-EM map of the X. laevis NPC.
(A) Cryo-EM density of the X. laevis NPC
(contour level, 3.0 σ) in top and side views,
shown with CR in cyan, NR in green,
IR and membrane region in gray, and the
channel density in magenta. The map
is eightfold symmetrized and at 19.8 Å
resolution. (B) Cryo-EM density of a
CR protomer at 6.9 Å resolution colored
by local resolution. (C) Cryo-EM density of
the X. laevis NPC CR ring (top view;
contour level, 9.5 σ) composed from the
6.9 Å CR protomer map by assuming the
eightfold symmetry. One of the CR protomers
is shown in cyan. (D) Cryo-EM density
(contour level, 4.5 σ) of a CR protomer
superimposed with the final model in
two orientations and colored by their model
colors, with inner Y-complex in blue,
outer Y-complex in green, Nup205 in
orange, Nup214-Nup88-Nup62 complex in
purple, Nup358 in red, and Nup155 in cyan.

Fontana et al., Science 376, eabm9326 (2022) 


(Nups) with structural elements of stacked
α-helical repeats and/or β-propellers, about
a third of which also contain phenylalanine-glycine
(FG) repeat sequences for selective
transport of cargoes (7-10). The approximately
eightfold symmetric NPC can be divided into
the cytoplasmic ring (CR) at the cytosolic side,
the inner ring (IR) and the luminal ring (LR)
on the plane of the nuclear membrane, and the 
nuclear ring (NR) facing the nucleus (Fig. 1A) 
(3, 4, 11-13). Tremendous progress has been 
made toward unveiling the architecture of 
this enormous molecular machine (J7-20). 
Here, we present the cryo-electron microscopy 
(cryo-EM) structure of the CR from Xenopus 
laevis oocytes. 


Structure determination 


We directly spread the nuclear envelopes (NEs) 
of actinomycin D (ActD)-treated X. laevis 
oocytes (78) onto Lacey grids with carbon foil 
on gold support and applied the Benzonase
nuclease to remove contaminating chromatin
(fig. S1A). Cryo-EM data collection was conducted
at different stage tilts and in counting
mode by use of a K3 detector mounted on a 
Titan Krios microscope at 1.4 Å pixel size.
Representative three-dimensional (3D) plots
composed of the X and Y positions and the
defocus levels (ΔZ) of the NPC particles in selected
tilt images showed the location-dependent
variation of the defocus values consistent with
the tilt planes (fig. S1B). Data processing performed
at the bin2 pixel size (2.8 Å) gave rise
to an eightfold averaged full NPC structure,
subtracted CR structure, and NR structure at
19.8, 14.6, and 14.7 Å resolutions, respectively
(Fig. 1A, fig. S2, and table S1). Symmetry expansion,
density subtraction, and 3D classification
led to CR and NR protomers at 11.1 and
15.1 Å resolutions.
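
The geometry behind two statements in this paragraph can be made explicit with a short Python sketch. It assumes a flat specimen tilted about an axis that runs through the image center, so that a particle at in-plane distance x from the axis is displaced in height by x·tan(tilt); the helper names and the example tilt angle are hypothetical, because the stage tilts used are not listed in this section.

```python
import math


def binned_pixel_size(pixel_size_A, bin_factor):
    """Pixel size (Å) after Fourier or real-space binning by an integer factor."""
    return pixel_size_A * bin_factor


def nyquist_limit(pixel_size_A):
    """Finest resolution (Å) representable at a given pixel size."""
    return 2.0 * pixel_size_A


def defocus_offset(x_A, tilt_deg):
    """Height change (Å) of a particle at signed distance x_A from the tilt axis."""
    return x_A * math.tan(math.radians(tilt_deg))


# bin2 of the 1.4 Å acquisition pixel gives the 2.8 Å processing pixel size.
bin2 = binned_pixel_size(1.4, 2)
print(bin2, nyquist_limit(bin2))
# Hypothetical 30° tilt: particles on opposite sides of the axis shift in opposite directions.
print(defocus_offset(2000.0, 30.0), defocus_offset(-2000.0, 30.0))
```

Under these assumptions, the per-particle defocus values fall on a plane, which is the behavior the 3D plots of X, Y, and ΔZ are described as showing.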

Final per-particle refinement and masking 
resulted in maps at 6.9 and 6.7 Å resolutions
for the full CR protomer and a core region,
respectively (Fig. 1, B and C; fig. S2; and table
S1). The Fourier shell correlation (FSC) plots
and 3D FSC plots for both maps are shown
(fig. S3, A to D), as well as particle orientation
distributions (fig. S3, E and F). The histograms
of per-angle FSC indicated fairly isotropic
resolutions along different orientations
(fig. S3, C and D). The map used for density
interpretation is the 6.9 Å resolution map of
the full protomer. Despite the modest 6.9 Å
resolution of the full CR protomer, the secondary
structures, especially helices, are apparent
in the maps (Fig. 1, B and C).
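
For readers unfamiliar with the FSC, a minimal NumPy sketch of the calculation between two half-maps follows. It is a generic textbook formulation and not the software used for this study; the function names are hypothetical, the maps are assumed to be cubic and unmasked, and the 0.143 cutoff is the conventional half-map threshold, which is an assumption here because the criterion is not stated in this section.

```python
import numpy as np


def fourier_shell_correlation(half1, half2):
    """FSC per integer-radius shell for two cubic half-maps of equal shape."""
    f1 = np.fft.fftshift(np.fft.fftn(half1))
    f2 = np.fft.fftshift(np.fft.fftn(half2))
    n = half1.shape[0]
    # Integer distance of every Fourier voxel from the zero-frequency voxel.
    grid = np.indices(half1.shape) - n // 2
    radius = np.rint(np.sqrt((grid ** 2).sum(axis=0))).astype(int)
    fsc = np.zeros(n // 2)
    for r in range(n // 2):
        shell = radius == r
        num = np.real((f1[shell] * np.conj(f2[shell])).sum())
        den = np.sqrt((np.abs(f1[shell]) ** 2).sum() * (np.abs(f2[shell]) ** 2).sum())
        fsc[r] = num / den if den > 0 else 0.0
    return fsc


def resolution_at_threshold(fsc, box_size, pixel_size_A, threshold=0.143):
    """Resolution (Å) at the first shell where the FSC falls below the threshold."""
    below = np.nonzero(fsc < threshold)[0]
    if below.size == 0:
        # The curve never crosses the threshold: report the Nyquist limit.
        return 2.0 * pixel_size_A
    shell = max(int(below[0]), 1)
    return box_size * pixel_size_A / shell
```

A 3D FSC, as cited above, applies the same correlation within angular sectors of each shell rather than over whole shells, which is what allows directional anisotropy to be assessed.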


Model building using AlphaFold 


We used the recently implemented breakthrough
algorithm for protein structure prediction
(AlphaFold) (21, 22), mainly as the
ColabFold notebook (22) with extended capa- 
bility to predict homo- and heterocomplexes, 
to build a nearly complete model of the CR 
protomer (fig. S4), which contains the inner 
and outer Y-complexes, two copies of Nup205, 
two copies of the Nup214-Nup88-Nup62 com- 
plex, one Nup155, and five copies of Nup358 
(Fig. 1D). 

Because no high-resolution models of X. laevis 
Nups were available, the workflow first in- 
volved prediction of five independent models 
of individual Nups, which in almost all cases 
gave essentially the same structures (tables
S2 and S3). For each prediction, we present
the overall and per-residue pLDDT (predicted 
local distance difference test; 0 to 100, with 
100 being the best), the pTM (predicted tem- 
plate modeling; 0 to 1, with 1 being the best), 
and the predicted alignment error (PAE) 
matrix (expected position error at residue x 
when the predicted and true structures are 
aligned on residue y, representing confidence 
of the relative positioning of each pair of residues
or domains) (tables S2 and S3). We picked
the top-ranked model by pLDDT for single
proteins and by pTM for complexes in each 
case for density fitting unless otherwise noted. 
Whereas helical Nups used the prominent 
helical features in the maps for their fitting, 
Nups with mainly a β-propeller domain required
prediction of binary complexes with
contacting helical Nups to guide the fitting 
(table S4). Last, for any ambiguous subunit 
interactions, we predicted complex structures, 
which further guided model fitting of the CR 
protomer (table S4). X. laevis Nups that have a 
substantial region not covered by homology to 
structural homologs in other species include 
Nup107, Nup133, Nup160, Nup205, and Nup358 
(tables S5 and S6 and fig. S5). 
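
The selection rule stated above (rank single proteins by pLDDT and complexes by pTM) can be written down compactly. The following Python sketch is illustrative only: the `Prediction` class, the `pick_model` helper, and all scores in the usage lines are hypothetical and are not values from tables S2 to S4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    name: str
    n_chains: int   # 1 for a single protein, >1 for a complex
    plddt: float    # overall pLDDT, 0 to 100
    ptm: float      # pTM, 0 to 1


def pick_model(predictions):
    """Return the model ranked first by pLDDT (single chain) or by pTM (complex)."""
    if not predictions:
        raise ValueError("no predictions supplied")
    is_complex = predictions[0].n_chains > 1
    key = (lambda p: p.ptm) if is_complex else (lambda p: p.plddt)
    return max(predictions, key=key)


# Hypothetical scores for five independent models of one Nup and of one binary complex.
single = [Prediction(f"model_{i}", 1, s, 0.7) for i, s in enumerate([88.1, 90.4, 89.7, 87.9, 90.1])]
binary = [Prediction(f"model_{i}", 2, 85.0, t) for i, t in enumerate([0.81, 0.79, 0.84, 0.80, 0.78])]
print(pick_model(single).name, pick_model(binary).name)
```

Using pTM for complexes reflects that it scores the global arrangement of chains, whereas pLDDT is a local, per-residue confidence that can remain high even when the relative placement of subunits is uncertain.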


The Y-complex 


The CR contains 16 copies of the Y-shaped com- 
plex (Y-complex), encircling head to tail to form 
the inner and outer layers of eight Y-complexes 
each in the ring (Fig. 1D) (23). Each Y-complex 
is composed of Nup160 and Nup37 (one short 
arm); Nup85, Nup43, and Seh1 (the other 
short arm); and Nup96, Sec13, Nup107, and 
Nup133 (the long stem) (Fig. 2A). Structural 
superposition revealed conformational differences
between inner and outer Y-complexes
near Nup133 (Fig. 2B and Movie 1), likely
because of the need to accommodate the dif- 
ferent diameters at the inner and outer layers. 
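
How a ring of 16 Y-complexes follows from one protomer containing an inner and an outer copy can be illustrated by applying eightfold rotational symmetry to a set of coordinates. The Python sketch below assumes an ideal C8 axis along z through the pore center; the helper name and the coordinates are hypothetical placeholders, not model coordinates.

```python
import numpy as np


def c8_copies(coords, n_fold=8):
    """Rotate an (N, 3) coordinate array about the z axis into n_fold symmetric copies."""
    copies = []
    for k in range(n_fold):
        theta = 2.0 * np.pi * k / n_fold
        c, s = np.cos(theta), np.sin(theta)
        rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        copies.append(coords @ rot.T)
    return np.stack(copies)


# One placeholder point per Y-complex in a protomer: inner layer and outer layer (Å).
protomer = np.array([[480.0, 0.0, 0.0], [560.0, 40.0, 20.0]])
ring = c8_copies(protomer)
print(ring.shape[0] * ring.shape[1])  # number of Y-complex positions in the ring
```

Because the real NPC is only approximately eightfold symmetric, a ring generated this way is an idealization of the kind used to compose the CR map from the protomer map.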

The AlphaFold-generated Nup160 structure 
fits well with the density of the inner and outer 
Y-complexes (Fig. 2C, fig. S5A, and tables S2
and S3). By contrast, the published homology
model of X. laevis Nup160 [Protein Data Bank
(PDB) ID 6LK8] (14) misses a C-terminal region
(Fig. 2C), which may have led to the incorrect
assignment of its density to Nup96
(Fig. 2C and fig. S5B) (4). Thus, building full- 
length models with AlphaFold may not only 
increase the structural accuracy of the indi- 
vidual subunits but also help to better assign 
and interpret densities. 

How β-propeller Nups in the Y-complex
(Nup37, Nup43, Seh1, and Sec13) fit in the CR
map cannot be easily discerned. We therefore
predicted structures of these Nups in complex
with their contacting α-helical Nups. Seh1-Nup85,
Nup43-Nup85, and Sec13-Nup96 complexes
were all predicted with excellent pTM
and pLDDT scores and fitted the cryo-EM density
as a rigid body (Fig. 2D; fig. S5, C and D;
and table S4). The Seh1-Nup85 and Sec13-Nup96
complexes exhibited hybrid β-propeller structures
in which an insertion blade from the
interacting helical domain completes the seven-bladed
propellers (Fig. 2E and fig. S5D), as also
observed in previous crystal structures of the 
corresponding, but partial, yeast and human 
complexes (24-26). AlphaFold failed to predict 
the Nup37-Nup160 complex (fig. S5E) (27), and 
we instead used the crystal structure to guide 
the Nup37 positioning in the map.


Nup205 and the Nup214-Nup88-Nup62 complex


Two AlphaFold-generated Nup205 models, 
which are larger than and quite different from 
the homologous crystal structure (28), were 
each fitted well at the channel side of the two 
Y-complexes to act as a bridge between them 
(Fig. 3A; Movie 2; fig. S6, A and B; and tables 
S5 and S6). The outer Nup205 runs from the 
C-terminal part of Nup160 to Nup85, and the 
inner Nup205 interacts with Nup160 at its 
N-terminal domain but tilts away from Nup85 
at its C-terminal domain because of the inter- 
action with the neighboring Nup214-Nup88- 
Nup62 complex (Fig. 3, A and B). 

We fit a prominent, flag-shaped density over 
inner Nup85 and extending to the outer Nup85 
by generating a composite model of the Nup214- 
Nup88-Nup62 complex (fig. S6C). The three 
proteins have been previously predicted to 
form coiled-coil interactions (4, 29-32). According
to AlphaFold, Nup88 and Nup214 also contain
β-propeller domains, and complex prediction
confirmed the coiled coils and agreed well
with the CR map: the β-propeller of Nup88
and one end of the helical bundle as the flag
base, the long helical bundle as the flagpole,
and the shorter helical bundle as the banner
(Fig. 3C). By contrast, the previous X. laevis CR
structure presented only a polyalanine model
for this complex (fig. S6D) (14). The β-propeller
domain of Nup214 does not have density, likely
because of a flexible linkage. A given Nup85 can
only bind to either Nup205 (for outer Nup85) 
or the Nup214-Nup88-Nup62 complex (for in- 
ner Nup85), but not both (Fig. 3, A and D), 
which explains the differential modes of Nup205 
interactions with the Y-complexes. 

We noticed another piece of nearby density, 
which was previously suggested as a second 
Nup214-Nup88-Nup62 complex (14) and was
fitted as such in a recent paper (20), which is 
in agreement with the expected stoichiometry 
from mass spectrometry data (73). Our density 
fit well with the flag base (Fig. 3D). However, 
the flagpole is largely missing. We do not
know whether this is due to a partial disorder 
of this region or a lower occupancy of the sec- 
ond complex as a result of ActD treatment in 
our sample. The Nup88-Nup214-Nup62 com- 
plex resembles the X. laevis Nup54-Nup58- 
Nup62 complex anchored by Nup93 of the 
IR or yeast Nup49-Nup57-Nsp1 complex in its 
coiled-coil region (fig. S6C) (33, 34), suggesting 
that coiled-coil structures are frequently used 
building blocks in NPC assembly. 


The five copies of Nup358 


The largest protein in the NPC, Nup358 (also 
known as RANBP2, or RAN binding protein 2), 
is composed of a largely disordered C-terminal 
region with FG repeats for gel-like phase for- 
mation and selective cargo passage and with 
binding sites for RANGAP, RAN, and other effectors
(Fig. 4A) (7, 8, 23, 35). AlphaFold predicted

Fig. 2. Fitting of Y-complex Nups with AlphaFold. (A) Cryo-EM density
(contour level, 8.0 σ) of the outer Y-complex colored by individual Nups.
The β-propeller domain of Nup133 was not built because of lack of density.
(B) Two views of superimposed inner Y-complex (blue) and outer Y-complex
(green) by the two short arms of the Y-complexes. The distal ends of
aligned Nup133 without counting the β-propeller have a distance of ~38 Å.
(C) Comparison of AlphaFold prediction (left) and homology modeling (right)
for Nup160. The cryo-EM density (contour level, 4.5 σ) and the positioning
of Nup160 (yellow) and Nup96 (cyan) by the two predictions are shown
at bottom. (D and E) AlphaFold-generated model of the Nup85-Seh1 complex
fitted with (D) the cryo-EM density (contour level, 4.5 σ) and (E) shown to
highlight the inserted blade.

Movie 1. Conformational difference between inner and outer Y-complexes. The movie shows models of
the complete Y-complexes, from 90° rotation around the horizontal axis to transition between conformations
of the outer and inner Y-complexes, with the main difference at Nup133. Details are reported in Fig. 2.

the Nup358 N-terminal region as having a 
large α-helical domain (~800 residues), a
linker, and an isolated single α-helix (Fig. 4,
A and B). Previously, only the structures of a
small N-terminal region (~150 residues) of human
and chimpanzee NUP358 were solved (36)
and used for homology modeling in X. laevis 
NPC (fig. S7A and tables S5 and S6) (14). The 
Nup358 globular domain is an S-shaped struc- 
ture, and we identified five copies of Nup358 
in the CR map (Fig. 4C and fig. S7B), which is 
consistent with the previous understanding 
of Nup358 as one of the most abundant pro- 
teins in the NPC (Fig. 4C and fig. S7B) (4). 

The full model of Nup358 molecules shows 
that four of the copies clamp around the inner
and outer Y-complexes near the junction
of Nup96 and Nup107 (Fig. 4, D and E, and
Movie 3), likely to stabilize the CR. In the outer
Y-complex, clamp A contacts Nup96 and Nup107
with ~750 and 400 Å² buried surface area, respectively,
and clamp B contacts Nup107 with
~630 Å² buried surface area, as calculated on
the PDBePISA server (37). In the inner Y-complex,
clamp C contacts Nup96 with only ~270 Å² buried
surface area, and clamp D interacts with
Nup107 with ~750 Å² buried surface area. Superposition
of the inner and outer Nup96-Nup107
complexes showed that clamps B and D both
contact Nup107 in a similar mode of binding,
but clamps A and C are shifted significantly to
account for the differences in the surface area
burial (Fig. 4F). The fifth Nup358 (clamp E),
situated in the center of the Nup358 cluster,
contacts clamp C (~1700 Å²) and Nup107
(~600 Å²) of the outer Y-complex. Thus, the
apparently weaker interaction of clamp C with
the Y-complex is compensated by the additional
interaction from clamp E.
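
The compensation argument can be checked by tallying the approximate buried surface areas quoted above. The short Python sketch below uses only those values; the dictionary layout and helper names are illustrative choices.

```python
# Approximate buried surface areas (Å²) quoted in the text, keyed by Nup358 clamp.
CONTACTS = {
    "A": {"outer Nup96": 750, "outer Nup107": 400},
    "B": {"outer Nup107": 630},
    "C": {"inner Nup96": 270, "clamp E": 1700},
    "D": {"inner Nup107": 750},
    "E": {"clamp C": 1700, "outer Nup107": 600},
}


def y_complex_area(clamp):
    """Area buried against Y-complex Nups only (clamp-clamp contacts excluded)."""
    return sum(a for partner, a in CONTACTS[clamp].items() if not partner.startswith("clamp"))


def total_area(clamp):
    """Area buried against all listed partners, including other clamps."""
    return sum(CONTACTS[clamp].values())


for clamp in sorted(CONTACTS):
    print(clamp, y_complex_area(clamp), total_area(clamp))
```

On these numbers, clamp C has the smallest direct interface with the Y-complex, yet its total buried area becomes one of the largest once the contact with clamp E is counted.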
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Homo-oligomeric Nup358 

We wondered whether the predicted isolated 
helix (Fig. 4B) following the S-shaped domain 
forms a coiled-coil structure, which is, however,
invisible because of its flexible linkage.
We thus used the COILS server (38), which predicted
up to 100% coiled-coil propensity for
this helix (Fig. 5A). We then used AlphaFold
to predict how the helix would assemble into
oligomers. We input the number of protomers
as six because coiled-coil structures with more
than five subunits are very rare, and six should
cover almost all possibilities. AlphaFold predicted
a pentameric coiled coil plus a single
helix as the top-ranked model with a pTM of
0.74 and pLDDT of 82.2. This is then followed
by two trimeric coiled-coil complexes with pTMs
of 0.45 and 0.44, a tetramer and a dimer with a
pTM of 0.57, and last, a hexameric coiled coil
with a pTM of 0.39 (Fig. 5B). The pentameric 
coiled coil also had the highest per-residue 
pLDDT scores at its core region (bluest) when 
displayed onto the structure (Fig. 5C). 
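
The pTM values reported for the five six-helix predictions can be compared directly. The Python sketch below sorts them by pTM, which is an assumption about how to compare them: the order in which the models are listed in the text does not follow pTM, so the notebook's own ranking may use a different metric. The labels are shorthand for the assemblies described above.

```python
# pTM values as reported in the text for the five six-helix predictions.
MODELS = [
    ("pentamer + single helix", 0.74),
    ("trimeric (first model)", 0.45),
    ("trimeric (second model)", 0.44),
    ("tetramer + dimer", 0.57),
    ("hexamer", 0.39),
]


def rank_by_ptm(models):
    """Sort (label, pTM) pairs from most to least confident."""
    return sorted(models, key=lambda m: m[1], reverse=True)


def margin_over_runner_up(models):
    """pTM gap between the best model and the next best one."""
    ranked = rank_by_ptm(models)
    return ranked[0][1] - ranked[1][1]


for label, ptm in rank_by_ptm(MODELS):
    print(f"{ptm:.2f}  {label}")
```

Whichever ordering is used, the pentameric arrangement comes first, and its pTM exceeds that of the next-best alternative by a clear margin.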

To corroborate the AlphaFold prediction, 
we expressed and purified His-tagged X. laevis
Nup358 (1 to 800, only the globular region) and 
Nup358 (1 to 900, with the coiled-coil region) 
and subjected them to gel filtration chroma- 
tography. Judging by gel filtration standards 
from the same column, Nup358 (1 to 800) may 
be consistent with a monomer, whereas Nup358 
(1 to 900) may be consistent with a pentamer 
(Fig. 5D). A pentameric Nup358 (Fig. 5E) may 
help its interactions with the Y-complexes 
through avidity, although the potential forma- 
tion of other oligomers cannot be excluded. A 
recent preprint reported an antiparallel tetra- 
meric crystal structure of the coiled-coil region 
of human NUP358 (39), suggesting that Nup358
from different species may assume different
modes of oligomerization. 

A recurrent human mutation of NUP358, 
Thr585→Met (T585M) (equivalent to X. laevis
T584M), is associated with autosomal-dominant 
acute necrotizing encephalopathy (ADANE) 
(40, 41). Thr” is mapped to a partially buried 
site in direct interaction with the hydrophobic 
side chain of Leu*” (fig. S7C), suggesting that
the mutation might affect the conformation of 
the structure and reduce its interaction with the 
Y-complexes. The dominant nature of this 
presumed loss-of-function mutation is con- 
sistent with the multimeric nature of Nup358 
in which the mutant co-oligomerizes with the 
wild-type protein to reduce the avidity for its 
interaction with the Y-complexes. 


Nup155 and unassigned densities 


Previously, a cryo-electron tomography (cryo- 
ET) study of human NPC showed localization 
of NUP155, a linker Nup, in both the CR and 
the NR (16). The AlphaFold-predicted Nup155 
structure consists of a β-propeller followed by
a large helical repeat domain (Fig. 6A), in an
organization similar to that of Nup160 and
Nup133. The helical repeat domain fits well
with the CR protomer map (Fig. 6B) and interacts
with inner Nup160, burying ~750 Å²
surface area, and with inner Nup205, burying
~310 Å² surface area (Fig. 6C). We wondered
whether we masked out the density for the
β-propeller during high-resolution refinement.
The full CR map from a previous step of data
processing (fig. S2) revealed density for a
complete Nup155 (Fig. 6D). In this map, the
β-propeller of Nup155, the neighboring inner
and outer Nup160, and inner Nup133 are situated
inside a membrane region of the density (Fig.
6D). The β-propeller domains of Nup155 and
Nup133 have been shown to possess a membrane- 
anchoring domain known as amphipathic lipid 
packing sensor (ALPS) (42, 43), which consists of 
a short, disordered loop that may fold into an 
amphipathic helix on membrane (44). 

We could not assign the identity of a piece 
of elongated density next to inner Nup205, 
Nup133, and Nup107 (fig. S8A). This density was 
absent from a previously deposited cryo-EM 
map of X. laevis CR (14) but was present in the 
deposited cryo-ET maps of X. laevis NPC treated 
or not with ActD (fig. S8B) (18). Another smaller 
piece of unassigned density situates adjacent 
to Nup358, inner Nup96, and outer Nup107 
(fig. S8A). The location of this density could 
be explained by Nup93 as suggested by a re- 
cently released paper and a preprint (20, 39). 
However, we were unable to properly fit Nup93 
because of the weaker density. 


Conclusion 


Our nearly complete model of the CR of X. laevis 
NPC reveals the molecular interactions within 
it and their biological implications. One aspect 
of the CR assembly that was unexpected is 
the observed asymmetry in the composition 
and mode of binding among Nups: the con- 
formational differences between the two Y- 
complexes, the different binding modes of the 
two Nup205 molecules with the Y-complexes, 
the two Nup214-Nup88-Nup62 complexes side 
by side, and the five Nup358 complexes with 
contrasting binding modes. It will be inter- 
esting to know whether this asymmetry repre- 
sents a basal state of the CR or is caused by 
ActD-mediated cargo deficiency, and whether 
it will be a common feature in the structures of 
the NR, IR, or LR. Our X. laevis NPC sample 
came from haploid oocytes, which may differ 
further from NPCs in somatic cells. 

We propose that the multiple copies of 
Nup358 and its oligomeric coiled-coil asso- 
ciation explain its implicated role as a key 
driver of NPC assembly during oogenesis in 
the cytosol that is different from the rapid 
postmitotic and the slower interphase NPC 
assembly (2). This process occurs on stacked 
membrane sheets of the endoplasmic reticu- 
lum (ER) termed annulate lamellae (AL), and 
Nup358 condensates from its FG repeats act as 
a fastener to spatially direct this NPC bio- 
genesis from scratch (2, 45). The additional 
requirement for the FG-containing Nup214 in 
Nup358 recruitment to the NPC (46) further 
suggests a role of condensation in NPC as- 
sembly. The oligomeric structure of Nup358 
may lower the threshold for Nup358 conden- 
sation, thus helping to explain its nucleating 
role among the different Nups. 

We also present an integrative approach to 
take advantage of the recent developments 
in cryo-EM technology (47, 48) and AlphaFold 
structure prediction (21, 22, 49), which led to 
a more precise modeling of the NPC. Similar 
approaches were also used in the structure 
determination of NPCs in recently published 
papers or preprints (19, 20, 50-52). AlphaFold 
prediction is in contrast to structure model- 
ing by means of homology to deposited struc- 
tures that are often partial or quite dissimilar. 
The goal of achieving high resolution is to 
obtain the best model possible; incorporating 
information from AlphaFold in the modeling 
process may be analogous to what the field did 
previously for stereochemical restraints (53). 
With the capability for complex prediction 
becoming more routine (22, 54, 55), we anticipate 
that this approach will not only assist the mod- 
eling of new structures but also help to reinter- 
pret previous medium-resolution cryo-EM maps 
and become a norm in structural biology. 


Materials and methods 
Sample preparation for cryo-EM 


X. laevis has played a key role in revealing 
the NPC structure because each oocyte has a 
large number of NPC particles (11, 14, 15, 18, 56). 
Freshly isolated stage VI oocytes of X. laevis 
in modified Barth’s saline (MBS; 10 mM 
HEPES at pH 7.5, 88 mM NaCl, 1 mM KCl, 
0.82 mM MgSO₄, 0.33 mM Ca(NO₃)₂, and 0.41 mM 
CaCl₂) were purchased and shipped overnight 
from Ecocyte Bioscience US LLC. To optimize 
the homogeneity of the NPC sample, we incu- 
bated these oocytes with 100 µg/ml Actinomy- 
cin D (ActD) at 4°C overnight to inhibit RNA 
synthesis and thus RNA export for synchroni- 
zation of the transport cycles (8). Each oocyte 
was poked at the animal pole using sharp 
tweezers to eject the nucleus, which was 
then transferred into a low-salt buffer contain- 
ing ActD (LSB; 10 mM HEPES at pH 7.5, 83 mM 
KCl, 17 mM NaCl, and 7.5 µg/ml ActD). The nu- 
cleus was further washed in a new LSB solution 
to reduce the contaminating yolk. Two or 
three washed nuclei were then transferred 
to the surface of a freshly glow-discharged 
grid. The NE was poked open, spread using 
glass needles, incubated for 10 min in 10 µl of 
LSB supplemented with Benzonase Nuclease 
(Sigma Aldrich, E8263) to remove the contam- 
inating chromatin, and subsequently washed 
twice with 10 µl of LSB. Then 3 µl of LSB was added to 
the grid, which was blotted for 3 to 5 s under 100% 
humidity at 4°C and plunged into liquid ethane 
using a Mark IV Vitrobot (ThermoFisher). 


Negative staining EM 


Nuclear membranes were applied on a freshly 
glow-discharged grid, using a Pelco EasyGlow, 
as described for cryo-EM sample preparation. 
Excess buffer was blotted on filter paper, and 
6 µl of a 1% uranyl formate solution was ap- 
plied for 30 s and blotted again on filter paper. 
Negatively stained samples were imaged on a 
JEOL JEM-1400 Transmission Electron Micro- 
scope at 120 keV. 


Cryo-EM data collection 


Screening and collection were performed at 
Stanford-SLAC Cryo-EM center (S2C2) with 
a Titan Krios electron microscope (Thermo 
Fisher Scientific) operating at 300 keV equipped 
with a K3 detector and a BioQuantum energy 
filter (Gatan, slit width 20 eV). Movies were 
collected in counting mode at a 1.4 Å pixel 
size (table S1). Because of the way the grids 
were made, most NPC particles would have a 
similar orientation with their eightfold axis 
perpendicular to the grid, and we expected 
to need a series of stage tilt angles to alleviate 
this orientation bias for 3D reconstruction. 
Given that gold grids 
can minimize beam-induced movement (57), 
we tested a number of gold grid types with 
the goal of identifying the one with the smallest beam- 
induced movement, which is often exaggerated 
at high tilt angles. These grids include Lacey 
carbon films on gold support, 300 mesh (Ted 
Pella), Quantifoil holey carbon films on gold 
support, R 1.2/1.3, 300 mesh (Quantifoil Micro 
Tools), UltrAuFoil holey gold films on gold 
support, R 1.2/1.3, 300 mesh (Quantifoil Micro 
Tools), and UltrAuFoil holey gold films on gold 
support overlaid with graphene (made by 
Wei Li Wang in the Wu lab). Lacey carbon films 
on gold support proved to be the most 
stable and were thus used for all data collection. 

To alleviate the orientation bias, we initially 
collected datasets at stage tilts of 0°, 35°, and 
45° with a total dose of 54 e⁻/Å² over 40 frames 
for 0° and 35°, and a total dose of 79.8 e⁻/Å² 
over 60 frames for 45°. An ideal tilt angle of 
42° was then calculated using cryoEF (58) from 
a preliminary 3D reconstruction, and was used 
for the subsequent data collection with a total 
dose of 80 to 140 e⁻/Å² over 80 to 120 frames. 
SerialEM was used for fully automated data 
collection, with a defocus range between −1 
and −3 µm. 
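
As a quick check on the exposure settings quoted above, the per-frame dose follows from dividing the total dose by the number of frames. The sketch below uses only the two settings for which the text gives an unambiguous pairing of dose and frame count; the function name is a hypothetical helper.

```python
def dose_per_frame(total_dose: float, n_frames: int) -> float:
    """Average electron dose per movie frame, in e-/A^2."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return total_dose / n_frames


if __name__ == "__main__":
    # (total dose in e-/A^2, number of frames), as quoted in the text
    settings = {"0 and 35 degree tilt": (54.0, 40), "45 degree tilt": (79.8, 60)}
    for label, (dose, frames) in settings.items():
        print(f"{label}: {dose_per_frame(dose, frames):.2f} e-/A^2 per frame")
```

Both settings work out to roughly 1.3 e⁻/Å² per frame, so the higher total dose at 45° reflects a longer exposure rather than a higher dose rate per frame.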

Fig. 4. Nup358 interacts with the Y-complexes as clamps. (A) Domain 
organization of X. laevis Nup358 and the approximate boundaries. ZnFs, zinc 
fingers. (B) AlphaFold-predicted structure of the N-terminal region of Nup358, 
showing the S-shaped globular domain, an isolated helix, and the flexible 
linker in between. (C) Fitting of Nup358 globular domain to the density (contour 
level, 8.0 σ). (D) The region of the map (contour level, 8.0 σ) containing five 
Nup358 molecules (labeled as clamps A to E) and two Y-complexes (Nup96- 
Nup107 complex), in two orientations. (E) Two Nup358 molecules each clamp 
around Nup96-Nup107 at the inner and outer Y-complexes. Clamps A and B 
(red) are for the outer Y-complex, and clamps C and D (pink) are for the 
inner Y-complex. The last Nup358 (clamp E, orange) contacts clamp C and 
Nup107 of the outer Y-complex. (F) Relative shifts in the clamp location on 
the two Y-complexes. The clamps B and D are similar in their location on Nup107, 
whereas clamps A and C have a shift in their position on Nup96. 

Movie 3. Interactions of Nup358 with the Y-complexes. The movie shows five copies of Nup358 and their 
interactions with inner and outer Nup96 and Nup107. The model zooms in to the five Nup358 clamps and 
then rotates 75° along the horizontal axis. Detailed interactions are reported in Fig. 4. 

Fig. 5. Nup358 is predicted to contain an oligomeric coiled coil. (A) Prediction of the single helix after the 
S-shaped globular domain for coiled-coil propensity by using a sliding window of 14, 21, or 28 residues. (B) The 
ranked five models of six Nup358 coiled-region protomers predicted with AlphaFold and the associated pTM 
and average pLDDT scores. The top model contains a pentamer and a monomer, suggesting that the pentamer is 
the most favorable oligomer. (C) Ribbon diagrams of four models from (B) (ranked at 1, 2, 4, and 5) and colored by 
per-residue pLDDT scores. A light spectrum from blue to red corresponds to highest to lowest pLDDT scores, 
respectively. (D) Elution fractions of X. laevis Nup358 (1 to 900, top) and Nup358 (1 to 800, bottom) from a gel 
filtration column. The elution positions of several standards are shown. aa, amino acid. (E) The ribbon diagram 
of a pentamer colored by each protomer and shown in side and top views. 


Fontana et al., Science 376, eabm9326 (2022) 10 June 2022 


Cryo-EM data processing 

Data processing leveraged computer support 
from the SBGrid Consortium (59). Movies were 
gain-reference corrected, corrected for beam-induced 
motion, and summed into motion-corrected 
and dose-weighted images using the RELION 
3.08 implementation of the MotionCor2 algo- 
rithm (60, 61). The distribution of average mo- 
tions per frame for each grid type at a given tilt 
angle was plotted using OriginLab (OriginPro 
2017 Suite, OriginLab Corporation, Northampton, 
MA, USA) to evaluate grid-dependent drift 
performance. 

The initial contrast transfer function (CTF) 
estimation of motion-corrected micrographs 
without dose-weighting was calculated by 
CTFFIND4 (62). All micrographs were manu- 
ally inspected and selected based on particle 
uniformity and contrast, and particles were 
picked manually. Gctf (63) was then used to 
determine the per-particle defocus values, 
from which 3D plots composed of the X and 
Y coordinates and the defocus (Z) of the particles 
for selected tilt images were generated using 
OriginLab (OriginPro 2017 Suite, OriginLab 
Corporation, Northampton, MA, USA). A plane 
was then fit to each 3D plot of a given image 
(fig. S1B). 
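
The plane fit described above was done in OriginLab; the following is only a minimal stand-alone sketch of the same idea, a least-squares fit of z = a·x + b·y + c to per-particle (x, y, defocus) triples. The function names and the synthetic data are hypothetical, and the gradient angle is meaningful only if coordinates and defocus share the same length unit and the defocus ramp arises solely from specimen tilt.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c; returns (a, b, c)."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    # Normal equations of the least-squares problem
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    rhs = [sxz, syz, sz]
    det = _det3(normal)
    if abs(det) < 1e-12:
        raise ValueError("points are collinear; plane is not defined")
    coeffs = []
    for col in range(3):  # Cramer's rule, one unknown per column
        m = [row[:] for row in normal]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(_det3(m) / det)
    return tuple(coeffs)


def gradient_angle_degrees(a, b):
    """Angle between the fitted plane and the x-y plane."""
    return math.degrees(math.atan(math.hypot(a, b)))


if __name__ == "__main__":
    # Synthetic particles on an exact plane with a slope along x only
    pts = [(x, y, 0.9 * x + 2.0) for x in range(5) for y in range(5)]
    a, b, c = fit_plane(pts)
    print(f"a={a:.3f} b={b:.3f} c={c:.3f} angle={gradient_angle_degrees(a, b):.1f} deg")
```

A fitted angle close to the nominal stage tilt would indicate that the per-particle defocus values are consistent with the tilt geometry.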

A total of 204,551 particles were manually 
picked, local CTF-corrected, and extracted from 
30,987 dose-weighted micrographs using a box 
size of 330 by 300 pixels at a 4× binned pixel size 
of 5.6 Å in RELION 3.08 (62). These particles 
were imported into cryoSPARC (64) to perform 
2D classification, from which 124,532 good 
particles were selected and merged for homo- 
geneous refinement. The published cryo-EM 
map of the human NPC (EMD-3103) (16) was 
low-pass filtered to 60 Å and used as the initial 
model. The homogeneous refinement with 
C8 symmetry resulted in a reconstruction at 
22.1 Å. These 124,532 particles 
were exported to RELION 3.08, extracted again 
with a box size of 660 by 660 pixels and a 
binned pixel size of 2.8 Å, and imported back 
into cryoSPARC to repeat 2D classification. 
A total of 101,366 particles were selected for homoge- 
neous refinement using the 22.1 Å map low- 
pass filtered to 40 Å as the initial model. The 
homogeneous refinement with C8 symmetry 
resulted in a 19.8 Å map. Particle density sub- 
traction with the aligned 101,366 particles for 
separate processing of the CR or the NR was 
done in cryoSPARC. The new local refinement 
in cryoSPARC using the subtracted particles 
and an NR or a CR mask led to NR and CR maps 
at 14.7 and 14.6 Å resolutions, respectively. 
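
The pixel sizes and box sizes quoted above can be cross-checked with simple arithmetic. The sketch below takes the 1.4 Å raw pixel size from the data collection section and, as an assumption, uses 330 pixels as the box edge for the first extraction; the helper names are hypothetical. The Nyquist limit (twice the pixel size) is a general sampling bound, not a value reported in the text.

```python
import math

RAW_PIXEL_A = 1.4  # raw pixel size in angstroms, as stated for data collection


def binned_pixel_size(raw_pixel: float, binning: int) -> float:
    """Pixel size after binning by an integer factor."""
    return raw_pixel * binning


def box_extent(box_pixels: int, pixel_size: float) -> float:
    """Physical edge length of an extraction box, in angstroms."""
    return box_pixels * pixel_size


def nyquist_limit(pixel_size: float) -> float:
    """Best resolution representable at a given sampling (2 x pixel size)."""
    return 2.0 * pixel_size


if __name__ == "__main__":
    for binning, box in ((4, 330), (2, 660)):
        pix = binned_pixel_size(RAW_PIXEL_A, binning)
        print(f"bin {binning}: pixel {pix:.1f} A, "
              f"box {box_extent(box, pix):.0f} A, "
              f"Nyquist {nyquist_limit(pix):.1f} A")
    # Both extractions cover the same physical field of view
    assert math.isclose(box_extent(330, 5.6), box_extent(660, 2.8))
```

Under these assumptions the two extractions span the same physical box (about 1848 Å), and the reported 22.1 Å and 19.8 Å maps sit well within the sampling limit of the binned data.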

The aligned 101,366 particles for the whole 
NPC were also exported to RELION 3.08 and 
subjected to auto-refine with local search and C8 sym- 
metry, with the 19.8 Å map low-pass filtered to 
30 Å as the initial model. The resolution of the 
auto-refined map was 19.5 Å. We then per- 
formed C8 symmetry expansion and density 
subtraction using a CR protomer mask, and 
these subtracted particles were recentered and 
re-windowed to a box size of 300 by 300 pixels, all 
in RELION 3.1. 3D classification using a CR 
protomer mask, local search with 50 iterations, 
and K = 6 was done on these subtracted par- 
ticles. A class with 333,214 particles was se- 
lected for auto-refine with a mask and local 
search, reaching an 11.1 Å resolution. CTF 
refinement accounting for beam-tilt estimation, 
anisotropic magnification estimation, and per- 
particle defocus estimation and the subse- 
quent auto-refine resulted in an improved map 
at 9.9 Å resolution. Additional reconstructions 
using a tight CR protomer mask or a tight core 
region mask led to maps at 8.8 and 8.4 Å 
resolutions. These aligned 333,214 subtracted 
particles were also imported into cryoSPARC 
to perform local CTF refinement and local re- 
finement. The final resolutions for the CR 
protomer and the core region were 6.9 Å and 
6.7 Å, respectively. All reported resolutions 
were estimated based on the gold-standard 
FSC = 0.143 criterion (fig. S2). All final maps 
were corrected and sharpened by applying a 
negative B factor using automated proce- 
dures in RELION 3.1. Local resolution var- 
iations of cryo-EM maps were estimated using 
Phenix. 

Fig. 6. Nup155 and other membrane-anchoring domains in the CR. (A) AlphaFold-predicted full-length 
Nup155. (B) Fitting of the C-terminal region of Nup155 into the cryo-EM density (contour level, 4.5 σ). 
(C) Interaction of Nup155 with the neighboring inner Nup160 and Nup205 (contour level, 4.5 σ). 
(D) β-propeller domains of Nup155, Nup133, and Nup160 all localize to the membrane envelope region 
of the cryo-EM density map of NPC CR full ring at 14.6 Å resolution (contour level, 3.0 σ). 


Prediction of NPC subunit structures 
by AlphaFold 


The AlphaFold structures in this study were 
mainly generated from the AlphaFold2 imple- 
mentation in the ColabFold notebooks (49) 
running on Google Colaboratory (21, 22), using 
the default settings with Amber relaxation 
(msa_method=mmseqs2, homooligomer=1, 
pair_mode=unpaired, max_msa=512:1024, 
subsample_msa=True, num_relax=5, use_ 
turbo=True, use_ptm=True, rank_by=pLDDT, 
num_models=5, num_samples=1, num_ensemble= 
1, max_recycles=3, tol=0, is_training=False, 
use_templates=False). The major difference 
between ColabFold and the native AlphaFold2 im- 
plementation is that ColabFold uses MMseqs2 
(65), which the ColabFold authors suggest gives 
equivalent results (22). For complex predic- 
tion, sequences were entered in tandem and 
separated by a semicolon. For coiled coil pre- 
diction, we used homooligomer=6. Due to 
computing memory constraints on Google Co- 
laboratory, we sometimes split up large pro- 
teins at disordered junctions to predict each 
segment separately. 
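
The text does not say how split points were chosen. As one hedged illustration, the sketch below greedily cuts a long sequence at user-supplied disordered junctions so that no segment exceeds a length limit. The function name, the junction positions, and the limit are all hypothetical and are not taken from the authors' procedure.

```python
def split_at_junctions(seq_len: int, junctions: list[int], max_len: int) -> list[tuple[int, int]]:
    """Split residues 1..seq_len into segments no longer than max_len.

    A junction j means "a cut is allowed after residue j". At each step the
    furthest allowed junction that keeps the segment within max_len is used.
    Returns inclusive (start, end) residue ranges.
    """
    cuts = sorted(j for j in set(junctions) if 1 <= j < seq_len)
    segments = []
    start = 1
    while seq_len - start + 1 > max_len:
        candidates = [j for j in cuts if j >= start and j - start + 1 <= max_len]
        if not candidates:
            raise ValueError(f"no allowed junction within {max_len} residues of {start}")
        cut = max(candidates)
        segments.append((start, cut))
        start = cut + 1
    segments.append((start, seq_len))
    return segments


if __name__ == "__main__":
    # Hypothetical 3000-residue protein with four disordered junctions
    print(split_at_junctions(3000, [700, 1350, 1500, 2400], max_len=1400))
```

Each resulting segment would then be submitted as its own prediction, and the segment models placed separately in the density.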

AlphaFold was run once with each of the 5 
trained models; the five models generated were 
checked for consistency, and unless specified 
otherwise, the top-ranked model was taken in 
each case for density fitting. AlphaFold com- 
putes pLDDT score and pTM score to indi- 
cate the accuracy of a prediction (23). We used 
pLDDT for ranking single protein models and 
pTM for ranking protein-protein complexes, 
as recommended by ColabFold (22). A pre- 
dicted alignment error map between pairs of 
residues was also calculated for each predic- 
tion, which represents confidence in domain 
positioning. Confidence metrics (global and 
per-residue pLDDT, pTM, and PAE maps) of 
predictions made in this work can be found 
in tables S2 to S4. A few larger proteins or 
complexes (more than 1400 residues in total 
length) were run on a Boston Children’s Hos- 
pital GPU cluster, by using default AlphaFold 
settings. 

To color ribbon diagrams based on per-residue 
pLDDT scores (range 0 to 100, with higher 
being better), these scores stored at the B-factor 
column of the .pdb files were changed to 100- 
pLDDT; thus, when colored as pseudo-B-factors 
in Pymol (66), a light spectrum from blue to red 
corresponds to highest to lowest pLDDT scores. 
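
The conversion described above can be sketched in a few lines. The example below assumes standard fixed-column PDB records, in which the B-factor occupies columns 61 to 66 of ATOM and HETATM lines; the function names are hypothetical and this is not necessarily how the authors performed the edit.

```python
def invert_plddt_line(line: str) -> str:
    """Replace the B-factor field (pLDDT) of one PDB record with 100 - pLDDT.

    Lines that are not ATOM/HETATM records, or that are too short to hold a
    B-factor field, are returned unchanged.
    """
    if not line.startswith(("ATOM", "HETATM")) or len(line) < 66:
        return line
    plddt = float(line[60:66])
    return f"{line[:60]}{100.0 - plddt:6.2f}{line[66:]}"


def invert_plddt_file(src_path: str, dst_path: str) -> None:
    """Write a copy of a PDB file with pseudo-B-factors set to 100 - pLDDT."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(invert_plddt_line(line))
```

After this conversion, a standard B-factor coloring scheme that runs from blue (low values) to red (high values) displays the most confident residues in blue.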


Model fitting and building 


Prior to beginning modeling, we used AlphaFold 
(21, 22) to generate all models of known com- 
ponents of the CR using the specific X. laevis 
sequences. An initial model of the Y-complex 
(PDB ID: 6LK8) (14) was fitted into the cryo-EM 
density using ChimeraX (67), and used as a 
reference for manual positioning of AlphaFold- 
generated subunit or complex structures into 
the density followed by executing the “fit in 
map” command to refine the placement. Flex- 
ible loops were removed to avoid steric clash. 
After building the two Y-complexes, we began 
to model the other densities. Nup205 cryo-EM 
density was easily recognized behind the Y- 
complexes due to the large size and overall 
shape. Inner and outer Nup205 assume differ- 
ent positions due to the presence of the Nup214- 
Nup88-Nup62 complex in the inner Y-complex. 
Nup358 density was easily recognized in the 
presence of the generated AlphaFold model 
with a prominent S shape, and allowed for 
identification of 5 copies for each CR proto- 
mer. Nup88 density was recognized due to the 
β-propeller and the long α-helix. The addi- 
tional density which belongs to the Nup214 
β-propeller was recognized upon generation 
of its AlphaFold model. Building of the Nup88- 
Nup214-Nup62 complex was assisted by predict- 
ing the hetero-trimeric coiled coil structure in 
AlphaFold, from which a composite model of 
the Nup88-Nup214-Nup62 complex was ob- 
tained. The final model was compared with 
the previous atomic model (PDB ID: 6LK8) 
(14). The model fitting quality was estimated 
for each subunit by the correlation coefficient 
in ChimeraX (67) and in Phenix (68). The 
correlation coefficient ranges from −1 to 1, 
with 1 as the perfect fit, and 0.5 to 1.0 as a good 
fit. This modeling process using AlphaFold is 
reminiscent of the use of stereochemical in- 
formation of amino acids and nucleic acids in 
the current practice of structural modeling 
(53) that increases model accuracy. 
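
To illustrate the kind of score discussed above, the sketch below computes a mean-subtracted (Pearson) correlation between two equally sized lists of density values, one sampled from the experimental map and one from a map simulated from the model. Whether this matches the exact quantity reported by ChimeraX or Phenix is not stated in the text, and the function names and toy values are hypothetical.

```python
import math


def correlation_about_mean(map_values, model_values):
    """Pearson correlation between two equally long sequences of densities."""
    if len(map_values) != len(model_values) or len(map_values) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_a = sum(map_values) / len(map_values)
    mean_b = sum(model_values) / len(model_values)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_values, model_values))
    var_a = sum((a - mean_a) ** 2 for a in map_values)
    var_b = sum((b - mean_b) ** 2 for b in model_values)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_b)


def is_good_fit(cc: float) -> bool:
    """Apply the 0.5 to 1.0 'good fit' range quoted in the text."""
    return 0.5 <= cc <= 1.0


if __name__ == "__main__":
    experimental = [0.1, 0.4, 0.9, 1.3, 0.8, 0.2]  # toy density samples
    simulated = [0.0, 0.5, 1.0, 1.2, 0.7, 0.1]
    cc = correlation_about_mean(experimental, simulated)
    print(f"correlation = {cc:.3f}, good fit: {is_good_fit(cc)}")
```

Because the score is scale-invariant, it compares the shape of the two densities rather than their absolute values.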


Nup358 expression and purification 


X. laevis Nup358 constructs (residues 1-800 
and 1-900) were cloned into pET21a with a 
C-terminal His tag. Expression was carried 
out in E. coli BL21 (DE3). Briefly, cells were 
grown in terrific broth media, supplemented 
with 100 µg/ml of ampicillin and 30 µg/ml of 
chloramphenicol, until OD600 reached 0.6. 
Cells were then transferred to 4°C for 30 min 
before the addition of 1 mM IPTG and in- 
cubation overnight at 18°C. Cells were pelleted 
at 3,000 g for 20 min and resuspended in lysis 
buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, 
1 mM TCEP, 10 mM imidazole) supplemented 
with a protease inhibitor cocktail. Lysis was 
performed by sonication and the soluble frac- 
tion was separated by centrifugation at 40,000 g 
for 1 hour at 4°C. The supernatant was incu- 
bated with Ni-NTA beads pre-equilibrated with 
lysis buffer, and purification was performed per 
manufacturer’s recommendation. Eluted frac- 
tions were further separated by gel filtration 
chromatography with a Superdex 200 Increase 
10/300 GL in gel filtration buffer (20 mM 
HEPES pH 7.4, 150 mM NaCl, 0.5 mM TCEP). 
Fractions were analyzed by Western blotting 
using an Anti-His antibody (Takara 631210). 
The Superdex 200 Increase 10/300 GL column 
was previously calibrated in gel filtration 
buffer using a high-molecular-weight calibration kit 
spanning 43 kDa to 669 kDa (Cytiva 28-4038-42). 

INTRODUCTION: The eukaryotic nucleus pro- 
tects the genome and is enclosed by the two 
membranes of the nuclear envelope. Nuclear 
pore complexes (NPCs) perforate the nuclear 
envelope to facilitate nucleocytoplasmic trans- 
port. With a molecular weight of ~120 MDa, 
the human NPC is one of the largest protein 
complexes. Its ~1000 proteins are taken in 
multiple copies from a set of about 30 distinct 
nucleoporins (NUPs). They can be roughly 
categorized into two classes. Scaffold NUPs 
contain folded domains and form a cylindrical 
scaffold architecture around a central channel. 
Intrinsically disordered NUPs line the scaffold 
and extend into the central channel, where 
they interact with cargo complexes. The NPC 
architecture is highly dynamic. It responds to 
changes in nuclear envelope tension with con- 
formational breathing that manifests in dila- 
tion and constriction movements. Elucidating 
the scaffold architecture, ultimately at atomic 
resolution, will be important for gaining a 
more precise understanding of NPC function 
and dynamics but imposes a substantial chal- 
lenge for structural biologists. 


RATIONALE: Considerable progress has been 
made toward this goal by a joint effort in the 
field. A synergistic combination of comple- 
mentary approaches has turned out to be 
critical. In situ structural biology techniques 
were used to reveal the overall layout of the 
NPC scaffold that defines the spatial refer- 
ence for molecular modeling. High-resolution 
structures of many NUPs were determined 
in vitro. Proteomic analysis and extensive bio- 
chemical work unraveled the interaction net- 
work of NUPs. Integrative modeling has been 
used to combine the different types of data, 
resulting in a rough outline of the NPC scaf- 
fold. Previous structural models of the human 
NPC, however, were patchy and limited in ac- 
curacy owing to several challenges: (i) Many of 
the high-resolution structures of individual 
NUPs have been solved from distantly related 
species and, consequently, do not comprehen- 
sively cover their human counterparts. (ii) 
The scaffold is interconnected by a set of 
intrinsically disordered linker NUPs that are 
not straightforwardly accessible to common 
structural biology techniques. (iii) The NPC 
scaffold intimately embraces the fused inner 
and outer nuclear membranes in a distinctive 
topology and cannot be studied in isolation. 
(iv) The conformational dynamics of scaffold 
NUPs limits the resolution achievable in struc- 
ture determination. 


A 70-MDa model of the human nuclear pore complex scaffold architecture. The structural model of the 
human NPC scaffold is shown for the constricted state as a cut-away view. High-resolution models are color coded 
according to nucleoporin subcomplex membership. The nuclear envelope is shown as a gray surface. 
Color key: NUP214 complex; NUP358; Y-complexes; inner ring; NUP210; NUP155 and transmembrane hub. 


Mosalaganti et al., Science 376, 1176 (2022) 10 June 2022 


RESULTS: In this study, we used artificial in- 
telligence (AI)-based prediction to generate an 
extensive repertoire of structural models of 
human NUPs and their subcomplexes. The 
resulting models cover various domains and 
interfaces that so far remained structurally 
uncharacterized. Benchmarking against pre- 
vious and unpublished x-ray and cryo-electron 
microscopy structures revealed unprecedented 
accuracy. We obtained well-resolved cryo- 
electron tomographic maps of both the con- 
stricted and dilated conformational states 
of the human NPC. Using integrative model- 
ing, we fitted the structural models of individ- 
ual NUPs into the cryo-electron microscopy 
maps. We explicitly included several linker 
NUPs and traced their trajectory through 
the NPC scaffold. We elucidated in great de- 
tail how membrane-associated and trans- 
membrane NUPs are distributed across the 
fusion topology of both nuclear membranes. 
The resulting architectural model increases 
the structural coverage of the human NPC 
scaffold by about twofold. We extensively val- 
idated our model against both earlier and 
new experimental data. The completeness 
of our model has enabled microsecond-long 
coarse-grained molecular dynamics simu- 
lations of the NPC scaffold within an expli- 
cit membrane environment and solvent. 
These simulations reveal that the NPC scaf- 
fold prevents the constriction of the other- 
wise stable double-membrane fusion pore 
to small diameters in the absence of mem- 
brane tension. 


CONCLUSION: Our 70-MDa atomically resolved 
model covers >90% of the human NPC scaf- 
fold. It captures conformational changes that 
occur during dilation and constriction. It 
also reveals the precise anchoring sites for 
intrinsically disordered NUPs, the identi- 
fication of which is a prerequisite for a com- 
plete and dynamic model of the NPC. Our 
study exemplifies how AI-based structure pre- 
diction may accelerate the elucidation of sub- 
cellular architecture at atomic resolution. 

Nuclear pore complexes (NPCs) mediate nucleocytoplasmic transport. Their intricate 120-megadalton 
architecture remains incompletely understood. Here, we report a 70-megadalton model of the human 
NPC scaffold with explicit membrane and in multiple conformational states. We combined artificial 
intelligence (AI)-based structure prediction with in situ and in cellulo cryo-electron tomography and 
integrative modeling. We show that linker nucleoporins spatially organize the scaffold within and across 
subcomplexes to establish the higher-order structure. Microsecond-long molecular dynamics simulations 
suggest that the scaffold is not required to stabilize the inner and outer nuclear membrane fusion 
but rather widens the central pore. Our work exemplifies how AI-based modeling can be integrated with 
in situ structural biology to understand subcellular architecture across spatial organization levels. 


Nuclear pore complexes (NPCs) are essen- 
tial for transport between the nucleus 
and cytoplasm and are critical for many 
other cellular processes in eukaryotes 
(1-4). Analysis of the structure and 
dynamics of the NPC at high resolution has 
been a long-standing goal toward a better 
molecular understanding of NPC function. 
These investigations have proven challenging 
because of the sheer size of NPCs and their 
compositional and architectural complexity. 
With a molecular weight of ~120 MDa, NPCs 
form an extensive 120-nm-wide protein scaf- 
fold of three stacked rings: two outer rings— 
the cytoplasmic ring (CR) and the nuclear ring 
(NR)—and the inner ring (IR). Each ring com- 
prises eight spokes that surround a 40- to 
50-nm-wide transport channel (5, 6). A single 
human NPC contains ~1000 copies of ~30 dis- 
tinct nucleoporins (NUPs). These NUPs arrange 
into multiple subcomplexes, most prominently 
the so-called Y-complex (7) arranged in a head- 
to-tail orientation within the outer rings (8). 
The assembly of individual subcomplexes into 
the higher-order structure is facilitated by an 
as yet incompletely characterized network of 
short linear motifs (SLiMs) embedded into 
flexible NUP linkers (9-13), which have been 
conceived of as a molecular glue that stab- 
ilizes the scaffold. Complicating things fur- 
ther, the assembled scaffold is embedded into 
the nuclear envelope (NE). Components of the 
NPC scaffold interact with the NE via amphi- 
pathic helices and transmembrane domains 
and are believed to stabilize the fusion of the 
inner and the outer nuclear membranes (INM 
and ONM, respectively) (14, 15). Finally, the 
FG-NUPs grafted to the scaffold form the 
permeability barrier filling the central chan- 
nel (16-18). Their intrinsically disordered 
phenylalanine-glycine (FG)-rich domains chal- 
lenge traditional structural biology methods. 
Owing to these intricacies, the current struc- 
tural models have severe shortcomings. In the 
case of human NPC, only 16 NUPs, accounting 
for ~35 MDa (30%) of the molecular weight 
of the complex, are included in the models 
(11, 19, 20). Although the repertoire of atomi- 
cally resolved structures of NUPs has grown 
tremendously (5, 6), said structures often have 
gaps in their sequence coverage, whereas 
homology models used by many studies have 
intrinsic inaccuracies. For some NUPs, no 
structures or homology models are available. 
Also, structural models put forward for other 
species are either incomplete or have limited 
precision (11, 12, 19, 21-23). Moreover, the 
NPCs from many other species have a vastly 
reduced architectural complexity, which limits 
their usefulness for studying human biology 
(12, 22-25). The exact grafting sites for FG- 
NUPs, which are crucial for understanding 
the transport mechanism, remain elusive. How 
exactly the NPC scaffold is anchored to the 
membrane, how it responds to mechanical 
cues imposed by the nuclear envelope, and if 
and how it contributes to shaping the mem- 
brane remain unknown. Finally, the models 
are static snapshots that do not take confor- 
mational dynamics into account. 

In this study, we combined cryo-electron 
tomography (cryo-ET) analysis of the human 
NPC from isolated NEs and within intact cells 
with artificial intelligence (AI)-based struc- 
tural prediction to infer a model of >90% of 
the human NPC scaffold at unprecedented 
precision and in multiple conformations. We 
demonstrate that AI-based models of NUPs 
and their subcomplexes built using AlphaFold 
(26) and RoseTTAfold (27) are consistent with 
unreleased x-ray crystallography structures, 
cryo-electron microscopy (cryo-EM) maps, 
and complementary data. We elucidate the 
three-dimensional (3D) trajectory of linker 
NUPs, the organization of membrane-binding 
domains, and grafting sites of most FG-NUPs 
in both the constricted and dilated conformations. 


Results 
A 70-MDa model of the human NPC scaffold 


The completeness of the previous structural 
models of the human NPC was limited by the 
resolution of the available EM maps in both 
the constricted and the dilated states and by 
the lack of atomic structures for several NUPs 
(19-21). To improve the resolution of the con- 
stricted state of the NPC, we subjected nuclear 
envelopes purified from HeLa cells to cryo-ET 
analysis, as described previously (19, 21). We 
collected an approximately fivefold larger 
dataset than what we previously published 
and applied a newly developed geometrical- 
ly restrained classification procedure (see 
Materials and methods). These improvements 
resulted in EM maps with resolutions of 12, 
12.6, and 23.2 Å, respectively, for the CR, IR, 
and NR (figs. S1 to S3). Next, we obtained an in 
cellulo cryo-ET map of dilated human NPCs 
in the native cellular environment within in- 
tact HeLa and human embryonic kidney 293 
(HEK293) cells subjected to cryo-focused ion 
beam (cryo-FIB) specimen thinning (fig. S1). 
The dilated in cellulo NPC exhibits an IR diameter
of 54 nm, compared with 42 nm in the constricted
state, consistent with previous work
in U2OS (28), HeLa (29), SupT1 (30), and, most
recently, DLD-1 cells (20). In contrast to other 
species, there is no compaction along the 
nucleocytoplasmic axis during dilation (fig. S2)
(23, 25). The quality of our in cellulo map is 
sufficient to discern the structural features 
known from the constricted state, such as a 
double head-to-tail arrangement of Y-complexes,
IR subunits, and inter-ring connectors. We
observe an increase in the distance between
the adjacent spokes within the IR, in agreement
with previous cryo-ET maps of NPCs
(20, 22, 23, 25).

Fig. 1. Scaffold architecture of the human NPC. (A) The near-complete model of the human NPC scaffold is shown for the constricted and dilated states as cut-away views. High-resolution models are color coded as indicated in the color bar. The nuclear envelope is shown as a gray isosurface. (B) Same as (A), but shown from the cytoplasmic side for the constricted NPC. The insets show individual features of the CR and IR enlarged with secondary structures displayed as cartoons and superimposed with the isosurface-rendered cryo-ET map of the human NPC (gray).

To generate a comprehensive set of structural
models of human NUPs, we used the
recently published protein structure prediction
software AlphaFold (26) and RoseTTAfold
(27). We found that most of the NUPs can
be modeled with high confidence scores (fig.
S5 and table S1). In addition, we validated the
accuracy of the models by comparison to
structures from accompanying publications
(31, 32) of human NUP358, NUP93, NUP88,
and NUP98 and of Nup205 and Nup188 from
Chaetomium thermophilum (fig. S6 and table
S1). These structures were not used as input
for the modeling procedure. The AI-based
models also excellently fit our EM densities,
with significant P values and high cross-correlation
scores (fig. S7, A to E). Furthermore,
we used single-particle cryo-EM to
determine the experimental structure of human
NUP155 (fig. S8, A to C) and validated the
respective AlphaFold model. Although the model
and the structure do not perfectly superpose
as whole chains owing to the flexibility of the
protein, their local tertiary structures and
side-chain conformations are highly similar [global
LDDT (local distance difference test) score of
91.6] (fig. S9). Notably, the loops that were not
resolved in the experimentally derived structure
consistently show low predicted LDDT
(pLDDT) scores (fig. S9B), further supporting
the reliability of this metric.
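
The LDDT comparison above can be illustrated with a minimal sketch, assuming a simplified Cα-only variant of the score. This is not the scoring pipeline used in the study: the function name `lddt_ca` is hypothetical, and the 15 Å inclusion radius and the 0.5, 1, 2, and 4 Å tolerances follow the commonly cited definition of the score rather than anything stated in the text. Because the score compares interatomic distances without superposition, a rigid-body shift of the whole chain leaves it unchanged, which may be why it remains informative for a flexible protein such as NUP155.

```python
import math

def lddt_ca(ref, model, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified C-alpha-only lDDT (0-100) of `model` against `ref`.

    ref, model: equally long sequences of (x, y, z) coordinates in angstroms,
    one per residue, in matching order. Assumed simplification: pairs are
    pooled over the whole chain instead of being averaged per residue.
    """
    if len(ref) != len(model):
        raise ValueError("ref and model must contain the same residues")
    preserved = 0
    total = 0
    for i in range(len(ref)):
        for j in range(i + 1, len(ref)):
            d_ref = math.dist(ref[i], ref[j])
            if d_ref >= radius:
                continue  # only local contacts in the reference are scored
            diff = abs(d_ref - math.dist(model[i], model[j]))
            # Count how many tolerance thresholds this distance satisfies.
            preserved += sum(diff < t for t in thresholds)
            total += len(thresholds)
    if total == 0:
        raise ValueError("no residue pairs within the inclusion radius")
    return 100.0 * preserved / total

# Hypothetical helix-like trace; a pure translation preserves every distance.
helix = [(math.cos(i), math.sin(i), 1.5 * i) for i in range(20)]
shifted = [(x + 10.0, y - 3.0, z + 2.0) for x, y, z in helix]
print(lddt_ca(helix, shifted))
```

A model that is stretched or locally distorted relative to the reference scores below 100, whereas one that differs only by a rigid-body motion does not.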

With full-length models at hand, we could 
identify the positions of NUP205 and NUP188 
within the scaffold, which had not been un- 
ambiguously determined in the previous human 
NPC (hNPC) cryo-ET maps. The AI-predicted
conformation of the N-terminal domain of
NUP358 fits the observed EM density better
than the two x-ray structures (fig. S7H). The
NUP358 localization is in agreement with
previous analysis (27) and α-helical densities
visible in the Xenopus EM map (33) (fig.
S10). The full-length model of the protein
ELYS, for which thus far only the N-terminal
β-propeller could be placed (27), fits the EM
map as a rigid body (fig. S7E) and confirms 
its binding site to each of the Y-complexes in 
the NR. The models of NUPs in the CR agree 
with the secondary structure observed in the 
Xenopus cryo-EM map (fig. S10). 

The capacity of AI-based structure prediction
tools to identify and model protein interfaces 
with high accuracy has recently been demon- 
strated (34-36). We therefore attempted to 
model NUP interfaces using the ColabFold 
software, a version of AlphaFold adapted for
modeling protein complexes (35). We found 
that ColabFold predicted several NUP sub- 
complexes with interdomain confidence scores 
that correlated with the accuracy of the models, 
while negative controls with nonspecific inter- 
actions yielded low confidence models (figs. S11 
to S14). The models of these subcomplexes not
only reproduced their respective, already available
x-ray structures but also agreed with
newly resolved x-ray structures (31, 32) (tables
S1 and S2) and exhibited physical parameters
similar to real interfaces (table S3). Specifically,
x-ray structures of C. thermophilum Nup205
and Nup188 in complex with Nup93 as well 
as Nup93 in complex with Nup35 are con- 
sistent with the human ColabFold model (fig. 
S6 and table S2). These structures represent 
proteins in complex with the respective SLiMs 
and form relatively small interfaces. However, 
for larger subcomplexes we also obtained struc- 
tural models that convincingly fit our cryo-ET 
maps (fig. S14). For example, the structure 
predicted for the so-called central hub of the 
Y-complex was consistent with the organiza- 
tion seen in fungal x-ray structures and ex- 
plained additional density within the cryo-ET 
map specific to the human NPC. Our model 
of the Y-complex hub includes a previously 
unknown interaction between NUP96 and 
NUP160 (fig. S11). ColabFold built a model of 
the NUP62 complex that has high structural 
similarity to the fungal homolog (table S2) 
and fits the EM map with significant P values 
(fig. S14), even though no structural templates 
were used for modeling. We were also able to 
obtain a trimeric model of the small arm of 
the Y-complex comprising NUP85, SEH1, and 
NUP43. The model fits the EM map with significant
P values, confirming the known structure
of the NUP85-SEH1 interaction (table S2) and
revealing how NUP43 interacts with NUP85
(fig. S14). In the case of the NUP214 complex,
for which no structures are available, the 
ColabFold model is highly consistent with 
the rather distinctively shaped EM density 
(fig. S14). The interface between NUP214 
and NUP88 that is biochemically validated in 
(31) has also been predicted with high struc- 
tural similarity to the equivalent interface be- 
tween homologs from C. thermophilum and 
budding yeast (table S2).

With the cryo-EM maps and the repertoire 
of structural models of individual NUPs and 
their subcomplexes, we built a nearly com- 
plete model of the human NPC scaffold (Fig. 
1A; supplementary text in the supplementary 
materials; and Materials and methods). The 
individual components are detailed in table 
S4. We used the previous model (19, 21) as a 
reference for modeling the scaffold of the 
constricted state and replaced all previously 
fitted domains with human AlphaFold and 
ColabFold models. We then added the remain- 
ing newly modeled subunits by systematic 
fitting to the EM map and refinement using 
Assembline (37) (figs. S14 and S15). In addition
to fitting the models, we added several
disordered linkers that connect spatially 
separated domains and SLiMs within the NPC 
(figs. S16 to S18). We then built the model of 
the dilated state by fitting the constricted NPC 
model into the dilated NPC map and refin- 
ing the fits using Assembline. The resulting 
models (Fig. 1) include 25 of the ~30 human 
NUPs (fig. S19 and table S4). The protein 
regions explicitly included in the models ac- 
count for 70 MDa of the molecular weight 
of the NPC (>90% of the scaffold molecular 
weight), compared with 16 NUPs and 35 MDa 
(46% of the scaffold weight) of the previous 
model, and largely account for the EM density 
observed in the constricted and dilated states. 
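
As a quick consistency check on these figures, the earlier model's 35 MDa at 46% coverage implies a total scaffold mass of roughly 76 MDa, against which 70 MDa corresponds to about 92%, in line with the >90% quoted above. The total is inferred here from the quoted numbers rather than reported in the text, and the helper name `implied_total_mass` is hypothetical.

```python
def implied_total_mass(modeled_mda, fraction):
    """Total mass implied by a modeled mass and the fraction it represents."""
    return modeled_mda / fraction

# Scaffold mass implied by the previous model (35 MDa, 46% of the scaffold).
previous_total = implied_total_mass(35.0, 0.46)
# Share of that inferred total covered by the present 70-MDa model.
current_fraction = 70.0 / previous_total
print(f"{previous_total:.1f} MDa, {current_fraction:.0%}")  # prints: 76.1 MDa, 92%
```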

This model yields new insights into the or- 
ganization of the human NPC (Fig. 1). Within 
the IR, NUP188 and NUP205 localize to the 
outer and inner subcomplexes, respectively, 
consistent with previous analysis in other spe- 
cies (12, 22, 23). Furthermore, we localized two 
copies of NUP205 in the CR and one in the NR 
(33), thus resolving previous ambiguities 
(11, 19, 21). Two previously undetected copies 
of NUP93 bridge the inner and outer Y- 
complexes in both the CR and NR, with an 
inherent C2 symmetry. This observation is 
consistent with biochemical experiments that 
initially identified interactors of NUP93 in the 
outer rings (38). The copy of NUP93 in the CR 
is located underneath the NUP358 complex, 
further corroborating a role of NUP358 in 
stabilizing the higher-order structure (27). Yet 
another copy of NUP93 that is specific to the 
CR bridges the inner Y-complexes from two 
consecutive spokes. This is consistent with 
an additional copy of NUP205 in the CR as 
compared with the NR, because NUP93 and
NUP205 heterodimerize through a SLiM within
the extended N terminus of NUP93 (see next 
section; fig. S10) (31, 32). The AI-based model 
of the NUP214 subcomplex interacts with 
NUP85, which points toward the central channel,
likely to optimally position the associated
helicase that is crucial for mRNA export. 


Linker NUPs fulfill dedicated roles of spatial 
organization within the higher-order assembly 


Because the exact 3D trajectory of the linkers 
through the NPC scaffold was unknown, it 
remained difficult to understand their pre- 
cise structural role beyond conceptualization 
as molecular glue. In our model, AI-based
models of human NUP-SLiM subcomplexes 
allowed us to map the anchor points of the 
linkers to the scaffold. The AI-based models 
correctly recapitulated SLiM interactions 
known from x-ray structures but also revealed 
previously unknown human NUP-SLiM inter- 
actions. In comparison to the x-ray structures, 
the AI-based models more extensively covered 
the structured domains, thus reducing the 
length of the linkers and restricting their 
possible conformational freedom within 
our model. 

To generate a connectivity map of the NUP 
linkers (Fig. 2), we used a multistep procedure. 
First, we calculated all geometrically possible 
connections. Next, we eliminated linker com- 
binations that were too distant, caused steric 
clashes, or were combinatorially impossible. 
Finally, we used Assembline to model the re- 
maining linkers in explicit atomic representa- 
tion for both constricted and dilated states 
(Fig. 2, figs. S16 to S18, supplementary text, 
and Materials and methods). 
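
The first two steps of this procedure can be sketched as a simple distance filter. The sketch below is an illustration under stated assumptions, not the Assembline workflow used in the study: the anchor names and coordinates are hypothetical, and the 3.8 Å maximum span per residue (the approximate Cα-Cα spacing of a fully extended chain) is our assumption. Steric clashes and combinatorial constraints are not treated.

```python
import math
from itertools import product

# Assumed upper bound on the distance one disordered residue can span (angstroms).
MAX_SPAN_PER_RESIDUE = 3.8

def feasible_connections(start_anchors, end_anchors, linker_residues):
    """List anchor pairs that a linker of the given length could bridge.

    start_anchors, end_anchors: dicts mapping an anchor name to (x, y, z)
    coordinates in angstroms. Returns (start, end, distance) tuples sorted
    by distance, keeping only pairs within the maximal linker span.
    """
    max_span = linker_residues * MAX_SPAN_PER_RESIDUE
    feasible = []
    for (s_name, s_xyz), (e_name, e_xyz) in product(
        start_anchors.items(), end_anchors.items()
    ):
        distance = math.dist(s_xyz, e_xyz)
        if distance <= max_span:  # discard connections that are too distant
            feasible.append((s_name, e_name, distance))
    return sorted(feasible, key=lambda connection: connection[2])

# Hypothetical example: a 40-residue linker can span at most 152 angstroms.
starts = {"spoke1": (0.0, 0.0, 0.0)}
ends = {"same_spoke": (50.0, 0.0, 0.0), "next_spoke": (300.0, 0.0, 0.0)}
print(feasible_connections(starts, ends, linker_residues=40))
```

In this toy case only the anchor within the same spoke survives the filter, which mirrors how a short linker restricts the set of connections that need to be modeled explicitly.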

The resulting connectivity map (Fig. 2A) 
reveals that the NUP35 linker regions bridge 
neighboring spokes of the IR. In our model, 
the NUP35 dimer is positioned into previously 
unassigned EM density between spokes (fig. 
S14), and each of the two copies reaches out
with its SLiMs to NUP155 and NUP93 of the 
adjacent spokes (Fig. 2A and fig. S16). The 
NUP35 dimer, which is critical during early NPC 
biogenesis (39), thus functions as an architec- 
tural organizer for the IR membrane coat in a 
horizontal direction along the membrane plane. 

In contrast, the connectivity map demon- 
strates that the linkers at the N terminus of the 
NUP93 copies that connect anchor points at 
NUP205 or NUP188, and the NUP62 complex in
the IR, cannot reach across spokes and thus 
connect subunits within a single IR subcom- 
plex inside of the same spoke (fig. S17). Thereby, 
the two outer copies bind to NUP188, while the 
two inner copies bind to NUP205. Thus, NUP93 
acts as an architectural organizer within, but not 
across, spokes. 

In the CR and NR, the linkage between 
NUP93 and NUP205 is geometrically possible 
(fig. S18). This linkage suggests a similar
architectural design of the respective complexes
in which the NUP93 SLiM that binds
the NUP62 complex could also facilitate linkage 
to the homologous NUP214 complex, although 
the corresponding structural information is 
still missing. The duplication of NUP205 and 
NUP93 in the CR is suggestive of yet another 
copy of the NUP214 complex that is not well 
resolved in the cryo-EM map of the constricted 
state and thus remains to be further inves- 
tigated. This analysis is consistent with the 
biochemical analysis and structural modeling in 
the accompanying publication (32). In conclu- 
sion, the individual linker NUPs specialize in 
dedicated spatial organization functions re- 
sponsible for distinct aspects of assembly and 
maintenance of the NPC scaffold architecture. 


A transmembrane interaction hub organizes the 
interface between outer and inner rings 


Several types of structural motifs associate the 
NPC scaffold with the membrane. The spatial 
distribution of the amphipathic helices and 
membrane-binding loops harbored by NUP160, 
NUP155, and NUP133 across the scaffold has 
been previously revealed (19). The analysis of 
the protein linkers in our model allowed the 
mapping of an approximate location of the 
amphipathic helices of NUP35. In addition to 
these motifs, the human NPC contains three 
transmembrane NUPs, the precise location of 
which remains unknown. 

Among the three human transmembrane 
proteins, NDC1 is the only one that is con- 
served across eukaryotes (40). NDC1 is known 
to interact with the poorly characterized 
scaffold NUP ALADIN (41, 42). We confirmed 
this interaction using proximity labeling mass 
spectrometry with BirA-tagged ALADIN and 
identified NDC1 and NUP35 as the most 
prominently enriched interactors (fig. S20). 
NDC1 is predicted to comprise six trans- 
membrane helices followed by a cytosolic 
domain containing mainly α helices, whereas
ALADIN is predicted to have a β-propeller
fold. Structures of both NDC1 and ALADIN, 
however, remain unknown. Using AlphaFold/ 
ColabFold, we could model the structures both
as monomers and as a heterodimeric complex
with high-confidence scores (figs. S5 and S11). 
Systematic fitting of the heterodimeric models 
to the EM map unambiguously identified two 
locations within the IR (fig. S14). The EM
density was not used as a restraint for model- 
ing, but it matches the structure of the model 
and is consistent with the only patches of 
density spanning the bilayer (Fig. 3, fig. S14, 
and supplementary text), therefore further 
validating the model. The two locations are 
C2-symmetrically equivalent across the nuclear 
envelope plane, thus assigning two copies of 
ALADIN and NDC1 per spoke, corroborating
experimentally determined stoichiometry 
(43). The identified locations are close to the
membrane-binding N-terminal domains of
NUP155 and amphipathic helices of NUP35,
which is also consistent with our proximity
labeling data (fig. S11) and previous functional
analysis (44). The proximity labeling
data prominently identified Gle1, which was
previously shown to interact with NUP155 (45).
Using AlphaFold, we predicted an interaction
between the C terminus of NUP155 and the
N-terminal SLiM in Gle1 with high confidence
scores (fig. S20B). We also predicted
an ALADIN-binding SLiM in NUP35, located
between the amphipathic helix and NUP155-binding
SLiM (Fig. 3 and fig. S11). We therefore
propose that, together with NUP155, NUP35,
ALADIN, and NDC1 form a transmembrane
interaction hub that anchors the inner membrane
coat of the IR and orients the NUP155
connectors toward the outer rings. The central
position of ALADIN within the NPC might
explain functional consequences of mutations
in ALADIN that are implicated in triple A
syndrome (46-48) and is consistent with the
absence of ALADIN in fungi, which lack the
NUP155 connectors (24).

Fig. 2. The connectivity of protein linkers within the human NPC. (A) The NUP35 dimer interconnects adjacent spokes across different subcomplex species, thus facilitating cylindrical assembly of the IR in both constricted (top) and dilated (bottom) states. The NUP205, NUP188, NUP62 complex, and the N terminus (amino acids 1 to 170) of NUP93 are hidden from view to expose NUP35. (B) The N terminus of NUP93 isostoichiometrically connects the subunits NUP205, NUP188, NUP62, NUP58, and NUP54 within the same subcomplex species. Insets show NUP93 connectivity, highlighting its interaction with two copies of NUP205 in the CR, two copies each with NUP205 and NUP188 in the IR, and a single copy of NUP205 in the NR. The respective subunits are color coded as in Fig. 1, while all other subunits and the nuclear membranes are shown in gray.

We next examined the structure of NUP210,
which contains a single-pass transmembrane
helix and is the only NUP that primarily
resides in the NE lumen. This NUP is composed
of multiple immunoglobulin-like domains
and is thought to form a ring around the NPC
within the NE lumen (49), but the structure
of the ring has only been modeled in fungi.
We used RoseTTAfold to model full-length
NUP210 and obtained an elongated model
with clearly defined interfaces between consecutive
domains. This model fitted the density
sufficiently well to allow tracing NUP210
monomers in the cryo-EM map of the Xenopus
laevis NPC, which is superbly resolved in the
luminal region (50). Thus, we could assign
eight copies of NUP210 per spoke (fig. S21).
Modeling of individual NUP210 fragments and
inter-NUP210 interactions with AlphaFold/
ColabFold (fig. S21 and table S1) led to a composite
model that explained the entire density
of the luminal ring of the human and Xenopus
cryo-EM maps, including the C-terminal transmembrane
helix. The helix is long enough to
span the NE and reaches the IR in the vicinity
of the NDC1/ALADIN/NUP155/NUP35
transmembrane interaction hub. This location
is consistent with known interactions of
NUP210 homologs (57) and our proximity
labeling data (fig. S20). The model of the
NUP210 ring also matches the luminal density
visible in our in cellulo EM map, allowing
us to model the ring in the context of both
constricted and dilated NPC (Fig. 1 and fig.
S21). To further confirm the NUP210 assignment
in human cells, we deleted NUP210 in
HEK293 cells using CRISPR-Cas9 and ana- 
lyzed the structure of the NPCs in cellulo using 
cryo-ET. The resulting map indeed showed a 
lack of the luminal ring density (fig. S4). The 
NPC scaffold in the resulting map appears
unchanged overall, including its diameter, suggesting
that NUP210 is not required for faithful
NPC assembly. 

Our model includes all known membrane-
binding domains except for those of POM121,
which is expressed in a cell type-specific manner.
Its precise location within the NPC remains
unknown, and neither AlphaFold nor RoseTTAfold
could build structural models of it with high
confidence. The resulting membrane association map
reveals that the membrane-binding β-propellers
of the Y-complex (NUP160 and NUP133) and 
the IR (NUP155) are distributed as multiple 
pairs over the entire scaffold, whereby they 
follow a well-defined pattern. They form an 
overall Z-shaped outline within an individual 
spoke (Fig. 3). The NDC1/ALADIN/NUP155/ 
NUP35 membrane-binding hub is situated 
at the interface of the IR with the outer rings 
and is distinct from the additional NUP155 pair 
at the NE symmetry plane. The membrane- 
binding motifs arrange in similar clusters in 
both the constricted and dilated states. Their
relative arrangement does not change uniformly
during dilation; rather, the relative distances
within the spokes remain constant while the
spacing of the spokes increases (Fig. 3).

Fig. 3. The membrane-anchoring motifs of the human NPC are distributed over the entire scaffold. The membrane-binding β-propellers of the Y-complex and IR complex are shown color coded and arranged as pairs of the respective inner and outer copies. ALADIN and NDC1 form a transmembrane interaction hub with the inner and connector copies of NUP155, which is shown enlarged in the inset in the cut-away side view (right). The nuclear membranes are shown as a gray isosurface. AH, amphipathic helix; L, loop; TM, transmembrane; SLiMs, short linear motifs.
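
The growth in spoke spacing during dilation can be estimated with a small geometric sketch. It assumes the eightfold rotational symmetry of the NPC and idealizes the spokes as equally spaced points on a circle, so the spacing is the chord between neighbouring points. The function name `spoke_spacing` is hypothetical, and 42 and 54 are the constricted and dilated IR diameters quoted in the Results; the function itself is unit-agnostic.

```python
import math

def spoke_spacing(ring_diameter, n_spokes=8):
    """Straight-line distance between neighbouring spoke centres.

    Spokes are idealized as equally spaced points on a circle of the given
    diameter; the result is in the same length unit as the input.
    """
    return ring_diameter * math.sin(math.pi / n_spokes)

constricted = spoke_spacing(42.0)
dilated = spoke_spacing(54.0)
print(f"spacing grows by a factor of {dilated / constricted:.2f}")  # prints: spacing grows by a factor of 1.29
```

Under these assumptions the spacing scales linearly with ring diameter, so a scaffold whose intra-spoke distances stay fixed must accommodate the entire diameter change in the gaps between spokes.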


The NPC scaffold prevents membrane 
constriction in the absence of 
membrane tension 


It has been a long-standing view that the 
scaffold architecture of the NPC has evolved 
distinct membrane-binding motifs to stabi- 
lize the membrane at the fusion of the INM 
and ONM (14, 52). To test the contribution of
the scaffold architecture to membrane curvature,
we performed molecular dynamics (MD) simulations
with a coarse-grained Martini force
field (53, 54). We first simulated a double- 
membrane pore without proteins with an 
initial pore diameter and membrane spacing 
as seen in the constricted NPC cryo-ET map. 
We found that the pore constricts during 1-μs
simulations and stabilizes once the radii of the 
INM or ONM and the NE hole are the same, 
such that the mean curvature nearly vanishes 
(Fig. 4A, fig. S23, and Movie 1). This is in line 
with Helfrich membrane elastic theory, which
predicts a catenoid-like pore shape with equal
radii of curvature at the pore center as the 
lowest energy structure, and an energetic cost 
of ~500 kJ/mol to widen the pore (supplemen- 
tary text). Notably, the opening of the relaxed 
double-membrane pore is considerably smaller 
than even the most constricted NPC confor- 
mation. The NPC scaffold thus keeps the pore 
wider than it would be without the scaffold. 
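
The vanishing mean curvature of a catenoid can be checked numerically. The sketch below assumes an idealized catenoid profile r(z) = a cosh(z/a) and the standard expressions for the principal curvatures of a surface of revolution; it is not derived from the simulations, and the function names are hypothetical. In the Helfrich description with zero spontaneous curvature, the bending energy density scales with the square of the mean curvature, so a surface with zero mean curvature carries no bending energy from this term.

```python
import math

def principal_curvatures(r, dr, d2r):
    """Principal curvatures of a surface of revolution with profile r(z).

    r, dr, d2r: the profile radius and its first and second derivatives
    with respect to the axial coordinate z, evaluated at one point.
    """
    slope = math.sqrt(1.0 + dr * dr)
    k_meridian = -d2r / slope**3   # bending along the pore axis
    k_azimuth = 1.0 / (r * slope)  # bending around the pore
    return k_meridian, k_azimuth

def catenoid_mean_curvature(a, z):
    """Mean curvature of the catenoid r(z) = a*cosh(z/a) at height z."""
    r = a * math.cosh(z / a)
    dr = math.sinh(z / a)
    d2r = math.cosh(z / a) / a
    k1, k2 = principal_curvatures(r, dr, d2r)
    return 0.5 * (k1 + k2)

# At the pore centre (z = 0) the two curvature radii are equal and opposite.
print(principal_curvatures(10.0, 0.0, 0.1))
print(catenoid_mean_curvature(10.0, 5.0))
```

At z = 0 the two principal curvatures have the same magnitude 1/a and opposite sign, which corresponds to the equal radii of curvature at the pore centre described above.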

These findings would predict that the 
nuclear membranes push against the NPC 
scaffold even in the most constricted state, 
which is in agreement with experimental 
data (23). To examine the effects of this tension 
on the NPC, we generated an NPC scaffold 
model with explicit membrane and water as 
a solvent and ran 1-μs MD simulations (Fig. 4B,
figs. S23 to S27, and Movies 2 and 3). We 
omitted the luminal NUP210 and focused our 
analysis on the architecturally important NR, 
CR, and IR. In these simulations, we found that 
the membrane pore wrapped tightly around the 
IR plane, adopting an octagonal shape (Fig. 4C). 
Similarly tight wrapping and octagonal shapes 
have been seen in the previous EM analyses of
NPCs (21, 55, 56). We also observed that the
diameter of the NPC scaffold constricted by 
~9% (Fig. 4B). We attribute this tightening 
primarily to mechanical tension in the pore 
widened beyond the catenoid shape. First, we 
observed similar contraction in simulations 
with rescaled protein-protein interactions (fig. 
S27). Second, by applying lateral tension on
the double membrane, we could maintain the 
pore width or widen it (Fig. 4B and fig. S23B). 
At even higher tension, the membrane spon- 
taneously detached from the NPC scaffold 
(fig. S28). Taken together, our data support a 
model in which the role of the NPC scaffold is 
not to stabilize the membrane fusion per se but 
rather to widen the diameter of the membrane 
hole without necessitating a wider envelope. 


Fig. 4. Dynamics of the NPC from molecular simulations. (A) An isolated half-toroidal double-membrane pore shaped initially as in the tomographic structure of the constricted NPC (left) tightens over the course of 1 μs of MD (right) toward the catenoid-like shape (green) predicted by membrane elastic theory. Shown are cuts along the axis of the double-membrane pore with lipid headgroups and tails in gold and gray, respectively. The solvent is not shown. (B) The NPC (cyan) widens by ~10% in response to lateral membrane tension (right; ΔP = 2 bar) compared with a zero-tension simulation (left; ΔP = 0). Shown are snapshots of the relaxed structures after 1 μs of MD. (C) The membrane fits tightly around the NPC inner ring (cyan, left; ΔP = 0) and forms an octagonally shaped pore (right, NPC not shown).

Discussion


We have built a 70-MDa model of the human
NPC scaffold in the constricted state (smaller
diameter) as adopted in purified nuclear envelopes
and in the dilated state as adopted in
cells; recent work in fungi has also identified
constricted NPCs inside of cells under
specific physiological conditions (23). Our model
includes multiple previously unassigned do- 
mains and proteins, resolves long-standing 
ambiguities in alternative NUP assignments, 
lays out a connectivity map of the protein 
linkers across the NPC scaffold, maps out the 
membrane-anchoring motifs, and provides a 
high-quality basis for further investigations 
of NPC dynamics and function. Our analysis 
demonstrates that our model is sufficiently 
complete for molecular simulations, which 
in the future could quantitatively and predictively
describe how the NPC interplays with
the nuclear membrane and how it responds 
to mechanical challenges. The model also pro- 
vides a more accurate starting point for sim- 
ulations of nucleocytoplasmic transport by 
providing the native constraints on the diam- 
eter and a more precise mapping of the posi- 
tions where the FG tails emanate from the 
scaffold (fig. S22).

How an intricate structure consisting of 
~1000 components can be faithfully assembled 
in the crowded cellular environment is a very
intriguing question. Our connectivity map
captures the 3D trajectory of linker NUPs 
through the assembled scaffold. Taken together 
with previous analysis of NPC assembly 
(9-13, 19), it suggests that the linker NUPs
perform dedicated spatial organization functions.
The connections of NUP93 within individual 
IR complexes and to the NUP214-complex 
suggest a role in ensuring isostoichiometric 
assembly. This finding is consistent with the 
recent analysis of early NPC biogenesis, sug- 
gesting that NUP93 associates isostoichio- 
metrically with the NUP62-complex already 
during translation in the cytosol (57). Thus, the
NUP62 subcomplex, together with the NUP205/188-NUP93
heterodimer, is likely preassembled in stoichiometric
proportions away from sites of
NPC biogenesis, explaining the importance of 
the linker for intra-subcomplex interactions. 
How the spokes form a C2 symmetric interface 
at the NE plane remains to be addressed. 

In the IR membrane coat, multiple inter- 
actions converge into a distinctive trans- 
membrane interaction hub. We propose that its
core is formed by the ALADIN-NDC1 heterodimer
at the interface between the outer and
inner rings. This transmembrane interaction 
hub is likely a spatial organizer for two 
proximate copies of NUP155 within the same 
spoke that point toward the outer rings and 
IR, respectively. ALADIN-NDC1 likely further 
associates with NUP210, which arches between 
spokes in the NE lumen. The hub also binds 
NUP35, which connects to NUP155 copies of 
neighboring spokes, thus facilitating the hor- 
izontal, cylindrical oligomerization. Because 
NUP35 associates with NUP155 early during 
NPC assembly (39), its dimerizing domain 
appears critical to scaffold its flexible linkers 
toward neighboring spokes within the IR mem- 
brane coat. 

The often-emphasized notion that NPCs 
fuse the INM and ONM or that they stabilize 
the fusion of the INM and ONM is not neces- 
sarily supported by our analysis. Our simu- 
lations suggest that the membrane fusion 
topology per se is stable under certain con- 
ditions, relaxing toward a catenoid shape with 
zero membrane bending energy. Indeed, some 
species maintain the fusion topology in the 
absence of NPCs, for example, during semi- 
closed mitosis in Drosophila melanogaster 
(58). Our analysis instead suggests that NPCs 
stabilize a pore that is wider than in the re- 
laxed, tensionless double-membrane hole. This 
notion agrees with the ultrastructural analysis 
of postmitotic NPC assembly, which has re- 
vealed that NE holes are formed at small 
diameters and dilate once NPC subcomplexes 
are recruited (59). These data argue that the 
membrane shape defines the outline of the 
NPC scaffold and not vice versa. 

We used the AI-based structure prediction programs
AlphaFold and RoseTTAfold to model all atomic 
structures that were used for fitting to the EM 
maps. Although x-ray and cryo-EM structures 
were used for validation, no experimental atomic 
structures were directly incorporated into the 
model. Predicted atomic structures tradition- 
ally exhibited various inaccuracies, limiting their 
usage for detailed near-atomic model building 
in low-resolution EM maps. However, Alpha- 
Fold and RoseTTAfold have recently demon- 
strated unprecedented accuracy in predicting 
structures of monomeric proteins (26, 27, 60-65)
and complexes (34, 36, 61, 66). They accurately 
assess their confidence at the level of individual 
residues and interdomain contacts (26, 27, 67). 
Indeed, we could successfully validate our models 
by comparing them to unpublished crystal 
structures, cryo-EM maps, and biochemical 
data. The resulting model of the NPC scaffold 
is almost complete and exhibits near-atomic- 
level precision at several interfaces. The model 
also contains several peripheral NUPs, for 
example, parts of the NUP214 and NUP358 
complexes. Projection of the locally estimated 
accuracy into an asymmetric unit of the NPC
reveals that the structured regions are generally
modeled with good confidence, while
linkers and peripheral loops are less well defined
(fig. S29). Although the entire EM density
for those peripheral NUPs is unlikely to be
resolved in the near future owing to their
flexibility, the complete model of the human
NPC could be within reach by integrating data
from complementary techniques that can address
flexible proteins, such as super-resolution microscopy,
fluorescence resonance energy transfer,
and site-specific labeling (78).

Movie 1. MD simulation of a half-toroidal double membrane. MD simulation trajectory of an isolated half-toroidal double membrane shaped initially as in the tomographic structure of the constricted NPC. The pore tightens within 1.2 μs of MD (see also Fig. 4B and fig. S23A for the diameter time trace). A top view of lipids is shown. Solvent is omitted for clarity.

Thanks to in situ and in cellulo cryo-ET 
and powerful AI-based prediction (26, 27),
intricate structures such as the NPC can now 
be modeled. Not all subunit or domain com- 
binations that we attempted to model with 
AI-based structure prediction led to structural
models that were consistent with comple- 
mentary data, emphasizing that experimental 
structure determination will still be required 
in the future for cases in which a priori knowledge 
remains sparse. However, even if AI-based
modeling does not yield high-confidence re- 
sults, the models can still serve as tools for 
hypothesis generation and subsequent exper- 
imental validation. 


Materials and methods 
Mammalian cell cultivation and 
subcellular fractionation 


Modified human embryonic kidney 293 cells
(HEK Flp-In T-Rex 293 Cell Line, Life Technologies)
designed for rapid generation of stably
transfected cell lines with a tetracycline- 
inducible expression system were used as 
parental cells. The NUP210 CRISPR-knockout 
line (HEK NUP210Δ) has been previously described
(67). In general, all cells were maintained
in Dulbecco’s modified Eagle medium
(DMEM) supplemented with 5 g/liter glucose
and 10% heat-inactivated fetal bovine serum
(FBS, Sigma-Aldrich). The HeLa Kyoto cell line was
maintained in DMEM containing
1 g/liter glucose supplemented with 2 mM
L-glutamine. Cells grown close to confluency
(~90%) were trypsinized with 0.25% trypsin
containing EDTA (Life Technologies) and passaged
for further growth. For the preparation
of nuclear envelopes, HeLa cells were cultured
and subjected to subcellular fractionation as
described before (8, 43).

Movie 2. MD simulation of an NPC with explicit membrane (α = 1.0). MD simulation of the NPC (cyan) covering ~1.2 μs with α = 1.0 (see also figs. S23 and S27 for the diameter time trace). A top view of the NPC with membrane is shown. Solvent is omitted for clarity.


Grid preparation 

Grids with HeLa nuclear envelopes were
prepared exactly as described in (21). For
the in cellulo work, Au200 R2/1 SiO₂ grids
(Quantifoil Micro Tools GmbH) were glow
discharged on both sides and sterilized under
ultraviolet light. In a six-well cell culture dish,
either 250,000 cells per well (for HeLa) or
400,000 cells per well (for HEK293) were pipetted
onto the grids prewetted with DMEM.
Cells were left to settle and attach to the grids for
4 hours at 37°C in 5% CO₂. Subsequently, the
HeLa or HEK293 grids were plunge-frozen with
a Leica EM GP plunger with the chamber
environment set to 99% humidity and 37°C. Grids were
blotted from the backside for 2 s and plunged
into a liquid ethane-propane mix (37 and 63%)
at about −195°C. HEK NUP210Δ grids were
washed once with phosphate-buffered saline
containing 8% dextran (35.45 kDa) and blotted
for 3 s before plunge freezing in liquid
ethane at −186°C.


Cryo-FIB milling and data acquisition 


Plunge-frozen sample grids were FIB-milled on
an Aquilos FIB-SEM (Thermo Fisher Scientific)
as described before (22, 23). In brief, samples
were coated with inorganic platinum
(Pt-sputtering). Subsequently, a protective layer
of organometallic platinum was deposited for
~20 s using the gas injection system. Cells
were then stepwise milled at a 20° angle to a
final thickness of ~200 to 250 nm using decreasing
ion-beam currents of 1 nA to 50 pA.
A final round of Pt-sputtering was applied
before unloading the sample.

Movie 3. MD simulation of an NPC with explicit
membrane (α = 0.7). MD simulation of the NPC
(cyan) covering ~1.2 µs with α = 0.7 (see also Fig. 4B
and fig. S23A for the diameter time trace). A top view
of the NPC with membrane is shown. Solvent is
omitted for clarity.


Cryo-electron tomography and subtomogram 
averaging of the human NPC from 
nuclear envelopes 


Tilt series were collected with SerialEM as
described by Kosinski et al. (19). The angular
coverage of the tilts spanned from −60° to
+60°. Ten 8K × 8K frames per tilt were collected
in the super-resolution mode on a K2
direct electron detector (Gatan Inc.) equipped
with a BioQuantum Imaging Filter (GIF). An
average total dose of 120 e⁻/Å² per tomogram
was used. Five hundred sixteen new tilt series
were collected and combined with 101 tilt
series reported by Kosinski et al. (19), leading
to a total of 617 tilt series. Incomplete tilt
series (missing more than seven tilts in either
direction or terminated owing to autofocusing
error because of the edge of the grid bar) were
discarded. The contrast transfer function (CTF)
was determined using CTFFind4 (68). Tilt
series with large discrepancies between the two
defocus values estimated by CTFFind4 were also
removed. This resulted in a total of 554 tilt series.
Tilt series were manually aligned by tracking
gold fiducials in IMOD (69). Tilt series were filtered
according to accumulated exposure on the
basis of parameters described by Kosinski et al.
(19). Tilt series were reconstructed with 3D
CTF correction using NovaCTF (70).




RESEARCH | 


STRUCTURE OF THE NUCLEAR PORE 


Subtomograms (7711) containing individual
NPCs were extracted, corresponding to 61,688
asymmetric units. The pixel size at the specimen
level was 3.37 Å. Tomograms were binned
by Fourier cropping 2x (bin2), 4x (bin4), and
8x (bin8), and subtomograms were extracted
at each level of binning, corresponding to
pixel sizes of 6.74, 13.48, and 26.96 Å, respectively.
Subtomogram averaging was performed on
a whole-pore level with the bin8 dataset.
Subsequently, asymmetric units were extracted
from the aligned pores and averaged as described
by Kosinski et al. (19). Subunits with
the center outside of tomogram boundaries
were excluded from further processing. The
CR, IR, and NR were processed separately,
that is, the positions of subunit centers were
moved to be in the center of each ring, resulting
in three different sets of subtomograms.
Rings with the center outside of tomogram
boundaries were excluded from further processing.
The subtomograms were iteratively
aligned first on bin4 and then on bin2 level,
and the final alignment was refined on bin1
level. The complete subtomogram averaging
and alignment was performed using novaSTA
(71); the masks necessary for the alignment
were created in Dynamo (72) and Relion (73).
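
As a quick consistency check of the numbers above, the following sketch derives the binned pixel sizes and the asymmetric-unit count from the unbinned pixel size and the eightfold symmetry. It is illustrative arithmetic only; the constant and function names are assumptions and are not part of novaSTA or the original processing scripts.

```python
# Illustrative arithmetic only; names are assumptions, not part of novaSTA.
UNBINNED_PIXEL_SIZE_A = 3.37  # pixel size at the specimen level, in angstroms
N_SUBTOMOGRAMS = 7711         # extracted whole-NPC subtomograms
SYMMETRY = 8                  # eightfold rotational symmetry of the NPC


def binned_pixel_size(pixel_size_a: float, binning: int) -> float:
    """Fourier cropping by `binning` multiplies the pixel size by that factor."""
    return round(pixel_size_a * binning, 2)


pixel_sizes = {f"bin{b}": binned_pixel_size(UNBINNED_PIXEL_SIZE_A, b) for b in (1, 2, 4, 8)}
asymmetric_units = N_SUBTOMOGRAMS * SYMMETRY

print(pixel_sizes)
print(asymmetric_units)
```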

After the bin4 alignment, the quality of each
subtomogram was assessed using geometrically
restrained classification that is based on
the expected geometrical shape of a complete
ring. More precisely, the subunits corresponding
to one NPC ring should still be part of a
ring after the alignment. For each ring and
each subunit, the angular distance of its normal
vector to all other normal vectors of subunits
within the same ring was computed.
Subsequently, the distances were averaged (for
each NPC separately), and for each subunit, the
deviation of its normal vector from the
average was computed. The same was done
for so-called in-plane vectors, that is, vectors
describing the direction from the NPC center
to the center of a subunit. The expected (ideal)
angular distance for the normal vectors is zero,
while for the in-plane vectors it is 45°. This
analysis was performed only on the rings with
at least three subunits that were retained from
the initial subtomogram averaging runs. The
rings with fewer than three subunits were
removed from further processing. The computed
deviations were used to identify poorly
aligned ring subunits in order to remove them.
For CR and IR, all subunits with a normal
vector deviation >30° were removed. A deviation
of in-plane angles from the expected 45°
greater than 15° for CR and 10° for IR was used
as the threshold for additional removal of ring
subunits. The threshold values were determined
empirically. For CR and IR, the numbers
of subunits left for processing after geometrical
cleaning were 31,774 and 35,281, respectively.
The geometrically restrained classification was added
to the publicly available novaSTA package (77).
The final subunit cleaning to remove poor-quality
subunits was performed on the final average
using the constrained cross-correlation (CCC)
value, which was computed between each
subtomogram and the reference during the last
iteration of alignment. Subtomograms with the
worst CCC values were subsequently removed
in batches of 1000, as long as the resolution
improved. The numbers of subunits or subtomograms
contributing toward the final structures
of the CR and IR were 21,604 and 30,000,
respectively.
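
The normal-vector part of the geometric cleaning can be sketched as follows. This is a simplified illustration, not the novaSTA implementation: it assumes that the per-subunit score is simply the mean angular distance between a subunit's normal vector and the normals of the other subunits in the same ring (ideal value zero), it omits the in-plane 45° check, and all function names are hypothetical. The 30° threshold and the three-subunit minimum are taken from the text.

```python
import math


def angle_deg(u, v):
    """Angle between two 3D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def mean_angular_distance(normals):
    """Per subunit: mean angle between its normal and all other normals in the ring."""
    scores = []
    for i, n_i in enumerate(normals):
        others = [angle_deg(n_i, n_j) for j, n_j in enumerate(normals) if j != i]
        scores.append(sum(others) / len(others))
    return scores


def keep_subunits(normals, max_deviation_deg=30.0, min_subunits=3):
    """Indices of subunits kept; rings with fewer than `min_subunits` are dropped."""
    if len(normals) < min_subunits:
        return []
    scores = mean_angular_distance(normals)
    return [i for i, score in enumerate(scores) if score <= max_deviation_deg]


# Hypothetical ring: four well-aligned subunits and one tilted by 60 degrees.
tilted = (math.sin(math.radians(60)), 0.0, math.cos(math.radians(60)))
ring = [(0.0, 0.0, 1.0)] * 4 + [tilted]
print(keep_subunits(ring))
```

In this toy ring the tilted subunit scores about 60° and is removed, whereas each aligned subunit scores about 15° and is kept.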

In contrast to the CR and IR, adding
tilt series and applying the geometric cleaning
procedure to the NR did not yield any significant
improvement in comparison to the dataset 
reported by Kosinski et al. (19). Thus, the 
original map of the NR was used for the pre- 
sented analysis. The map was created using 
steps described by Kosinski et al. (19), and the 
total number of particles contributing to the 
final average was 11,112. 


Cryo-electron tomography and subtomogram 
averaging of human NPC in cellulo 


Data acquisition was performed on a Titan
Krios G2 (for HeLa) or G4 (for HEK) (Thermo
Fisher), operating at 300 kV and equipped
with a Gatan K2 Summit direct electron detector
and energy filter, as described before (23).
In brief, tilt series were acquired in
dose-fractionation mode at 4k by 4k resolution
with a nominal pixel size of 3.37 Å (HeLa) or
3.45 Å (HEK) using an automated
dose-symmetric acquisition scheme (74) starting at
a given pre-tilt corresponding to the tilt of the
FIB-milled lamellae (typically +13°). Tilt series
were acquired with a tilt increment of 3°,
a tilt range of −50° to +50°, and a total
dose per tomogram of 120 to 150 e⁻/Å².

Tilt series preprocessing, tomogram reconstruction,
and subtomogram averaging were
performed as described previously (22, 23). In
brief, for the HeLa control dataset, 53 NPCs
were extracted from 13 tomograms. For the
HEK dataset, 30 control and 43 NUP210Δ
NPCs were extracted from 8 control and 14
NUP210Δ tomograms, respectively. Whole pores
were aligned using bin8 and bin4 subtomograms
with imposed eightfold symmetry. Upon
convergence, 280, 150, and 222 subunits were
extracted from the control HeLa, control HEK,
and HEK NUP210Δ datasets, respectively, and
the CR, IR, and NR subunits were further refined
independently using bin4 subtomograms.
For the HEK datasets, the individual ring
subunits were refined without splitting the data
into independent half sets to a final resolution
of <54 Å (NR) and <48 Å (CR and IR), as
estimated by Fourier shell correlation (FSC)
using the 0.5 criterion. For the HeLa dataset,
gold-standard criteria were used to calculate
the FSC, which resulted in a final resolution of
45 Å (CR and IR) and 53 Å (NR).


Structure determination of human NUP155 


The gene encoding human NUP155 (UniProt
ID: O75694) was synthesized by GeneArt (Life
Technologies) and cloned into a modified
pFastBac vector, with a His6 tag and an
enhanced green fluorescent protein tag followed
by an HRV 3C protease site at the N
terminus and a Strep tag at the C terminus.
The predicted membrane-binding loop (residues
260 to 273) was deleted to improve the protein
stability. The resulting construct was expressed
in Sf21 insect cells using the Bac-to-Bac
baculovirus expression system (Thermo Fisher
Scientific). Sf21 cells were cultured in Sf900III
medium (Gibco) at 27°C and infected at a
density of 1 × 10⁶ to 2 × 10⁶ cells ml⁻¹. After
48 hours of incubation, cells were collected
by centrifugation (3000g, 10 min, 27°C), and the
pellets were stored at −80°C until purification.

For purification, a frozen cell pellet from a
100 ml culture was resuspended in 10 ml of
buffer containing 20 mM HEPES (pH 7.5),
100 mM NaCl, 1 mM dithiothreitol (DTT),
and 0.1 mM phenylmethylsulfonyl fluoride
and disrupted by sonication for 5 min using a
Branson sonifier 250. After removing cell debris
by centrifugation (3000g, 10 min, 4°C), the
supernatant was mixed with 500 µl of
Strep-Tactin Sepharose resin (IBA Lifesciences)
and incubated at 4°C for 30 min. The resin
was washed with 8 ml of buffer containing
20 mM HEPES (pH 7.5), 100 mM NaCl, and
1 mM DTT, and the bound sample was eluted
with the same buffer supplemented with 5 mM
biotin. The eluted fractions were concentrated
with an Amicon Ultra 0.5 ml centrifugal filter
(100 kDa molecular weight cut-off, Millipore),
mixed with Turbo-3C protease (Sigma-Aldrich),
and incubated at 4°C overnight. The sample was
then ultracentrifuged (71,680g, 15 min, 4°C) and
loaded onto a Superose 6 Increase 3.2/300 column
equilibrated with 20 mM HEPES (pH 7.5), 100 mM
NaCl, and 1 mM DTT. The peak fractions were
aliquoted and stored at −80°C until use.

For the preparation of EM grids, the protein
concentration was adjusted to 0.4 mg ml⁻¹ in
20 mM HEPES (pH 7.5), 100 mM NaCl, 1 mM
DTT, and 0.001% dodecyl maltoside. The diluted
sample was then applied onto freshly
glow-discharged UltrAuFoil R0.6/1.0 gold grids
(300 mesh, Quantifoil), blotted for 6 s at 4°C in
100% humidity, and plunge-frozen in liquid
ethane using a Vitrobot Mark IV. Cryo-EM data
were collected on a Titan Krios G4 microscope
(Thermo Scientific) operated at 300 kV, equipped
with an E-CFEG, a Falcon 4 direct electron detector
(Thermo Scientific), and a Selectris X energy filter
(Thermo Scientific) operated with a slit width
of 10 eV. Automated data acquisition was performed
with the EPU software at a nominal
magnification of ×165,000, corresponding to a
pixel size of 0.730 Å per pixel. Movies were
acquired at a dose rate of 4.95 electrons per
pixel per second and a total dose of 50 e⁻/Å²,
resulting in EER movies consisting of 1407
frames. In total, 6430 movies were acquired
with a defocus range of −1.0 to −2.0 µm.
Acquired images were first processed in
cryoSPARC (75), and selected particles were
further processed in RELION-3.1 using the
csparc2star.py script in UCSF pyem (76) for
transfer of particles. To analyze the conformational
flexibility, multibody refinement (77) was
performed on the consensus map. The details
of data processing are summarized in fig. S8.

The model of human NUP155 was manually 
built in Coot (78), using the crystal structure 
of Nup170, a homolog of NUP155, from C. 
thermophilum (PDB ID: 5HAX) as a starting 
model. Secondary structure prediction from 
PSIPRED (79) and multiple sequence align- 
ment were used to facilitate the model build- 
ing. The model was iteratively refined using 
phenix.real_space_refine (80). Figures were 
prepared with UCSF Chimera (81), UCSF 
ChimeraX (82), and CueMol
(http://www.cuemol.org/).


Proximity labeling using BioID


BioID analysis of ALADIN was done as previ- 
ously described (83). In brief, ALADIN was 
BirA-tagged and overexpressed in HEK293 Flp-In
T-REx cells. Quantitative mass spectrometry
was done in four biological replicates and in 
comparison to control cells expressing BirA- 
tagged NLS-NES-Dendra that resides within 
the central channel. 


Structural modeling of NUPs and 
NPC subcomplexes 


The structures of all individual NUPs and
selected subcomplexes were modeled using
AlphaFold (26) or downloaded from the AlphaFold
Database (60). The models of monomeric
proteins (NUP155, NUP133, NUP107, NUP93,
NUP205, NUP188, NUP160, NUP358, and ELYS)
were downloaded from the AlphaFold Database (60).
To model subcomplexes or their parts around
the interfaces (NUP62-NUP54-NUP58,
NUP205-NUP93, NUP188-NUP93, NUP155-NUP35,
NUP93-NUP35, NDC1-ALADIN, NUP35 homodimer,
NUP85-SEH1-NUP43, NUP160-NUP96-SEC13,
NUP160-NUP37, NUP133-NUP107,
NUP96-NUP107, NUP160-NUP96-NUP85,
NUP214-NUP62-NUP88, and NUP88-NUP98), we used
the AlphaFold version modified for modeling
complexes, available through ColabFold (35),
with all parameters set to default except for the
max_recycles parameter, which was set to between
12 and 48, depending on the subcomplex.
For NUP210, we first built an initial full-length
model using RoseTTAfold (27), as AlphaFold
did not provide a full-length model fitting
well into the EM density map. After fitting
the model into the EM maps as a rigid body
(see below), we used AlphaFold to model successive
monomeric and homodimeric fragments
of NUP210, superposed them onto the fitted
RoseTTAfold model, and refined the fits. The
quality of the AlphaFold models was first assessed
by the scores provided by the authors:
the predicted local distance difference test
(pLDDT), which predicts the local accuracy,
and the predicted aligned error, which assesses
the packing between domains and protein
chains. In addition, we validated the models
by comparing them to structures not used for
modeling, structures published in the accompanying
paper (32), fits to the cryo-ET maps, and
previously published biochemical data (figs.
S5 to S7, S10, S11, and S14).


Systematic fitting of atomic structures to 
cryo-ET maps 


We used the previously published procedure 
for systematic fitting (8, 19, 21, 22, 25, 37, 84) 
to both locate the atomic structures in the 
cryo-ET maps and validate the AlphaFold 
models. Before fitting, all the high-resolution 
structures were filtered to between 10 and 15 A. 
The resulting simulated model maps were sub- 
sequently fitted into individual ring segments of 
cryo-ET maps by global fitting as implemented 
in UCSF Chimera (82) using scripts in Assemb- 
line (37). The maps used for fitting excluded 
nuclear envelope density in order to eliminate 
the possibility of fits overlapping with the 
membrane. All fitting runs were performed 
using 100,000 random initial placements, with 
the requirement of at least 30 to 60% (depend- 
ing on the size of the structure) of the 
simulated model map to be covered by the 
cryo-ET density envelope defined at a low 
threshold. For each fitted model, this pro- 
cedure resulted in ~1000 to 20,000 fits with 
nonredundant conformations upon clustering. 
The cross-correlation about the mean (cam 
score, equivalent to Pearson correlation) score 
from UCSF Chimera (87) was used as a fitting 
metric for each atomic structure, similarly to 
our previously published works. The statistical 
significance of every fitted model was evaluated 
as a P value derived from the cam scores. The 
calculation of P values was performed by first 
transforming the cross-correlation scores to 
z-scores (Fisher’s z-transform) and centering,
from which subsequently two-sided P values 
were computed using standard deviation 
derived from an empirical null distribution 
[based on all obtained nonredundant fits and 
fitted using fdrtool (85) R-package]. Finally, 
the P values were corrected for multiple testing 
with Benjamini-Hochberg procedure (86). 
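
The P value calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published pipeline: the empirical null distribution fitted with the fdrtool R package is replaced here by a simple robust estimate (median centering and a standard deviation derived from the median absolute deviation), and all function names are hypothetical. The Fisher z-transform, the two-sided normal P value, and the Benjamini-Hochberg correction follow their standard definitions.

```python
import math
import statistics


def two_sided_p_values(cc_scores):
    """Fisher z-transform the correlation scores, center them, and return
    two-sided normal P values. The null standard deviation is a MAD-based
    stand-in for the empirical null that the paper fits with fdrtool."""
    z = [math.atanh(r) for r in cc_scores]            # Fisher's z-transform
    center = statistics.median(z)
    centered = [value - center for value in z]
    mad = statistics.median([abs(value) for value in centered])
    sd = 1.4826 * mad                                 # MAD -> sd for a normal
    if sd == 0.0:
        raise ValueError("null standard deviation is zero")
    return [math.erfc(abs(value) / (sd * math.sqrt(2.0))) for value in centered]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):                      # largest P value first
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical scores: many background fits and one clearly better fit.
scores = [0.10, 0.12, 0.11, 0.09, 0.13, 0.60]
p_raw = two_sided_p_values(scores)
p_adj = benjamini_hochberg(p_raw)
print(min(range(len(scores)), key=lambda i: p_adj[i]))
```

With real fitting runs the null would be estimated from the thousands of nonredundant fits per structure, so the toy list above only demonstrates the order of operations.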


Modeling of the human NPC scaffold 


To assemble the models of the entire NPC
scaffold based on the constricted and dilated
cryo-ET maps, we used our integrative modeling
software Assembline (37), which is based
on Integrative Modeling Platform (IMP) (87)
version 2.15 and Python Modeling Interface
(PMI) (88). First, we built the model of the
constricted NPC owing to its higher resolution.
The AlphaFold models of NUP domains
and subcomplexes already present in our
previous human NPC models (19, 21) were
placed in the map by superposing them onto
the published models. The remaining domains
and subcomplexes added in this work (NUP358,
NUP35, NUP93 in the outer rings, NDC1,
ALADIN, and the NUP214 complex) were
placed using systematic fitting (as above) and
the global optimization procedure of Assembline.
In addition to using models of subcomplexes
as rigid bodies for fitting, several inter-subunit
interfaces were restrained by elastic distance
networks derived from ColabFold models
overlapping with and bridging already fitted
models. During the refinement, the structures
were used as rigid bodies and simultaneously
represented at two resolutions: in
Cα-only representation and a coarse-grained
representation, in which each 10-residue stretch
was converted to a bead. The 10-residue bead
representation was used for all restraints to
increase computational efficiency except for
the domain connectivity restraints, for which
the Cα-only representation was used. The flexible
protein linkers between the domains were
added as chains of one-residue beads. The
entire structure was optimized using the refinement
step of Assembline to optimize the fit to
the map, minimize steric clashes, and ensure
connectivity of the protein linkers. The scoring
function for the refinement comprised the EM
fit restraint; clash score (SoftSpherePairScore
of IMP); connectivity distance between domains
neighboring in sequence; a term preventing
overlap of the protein mass with the nuclear
envelope; a restraint promoting the
membrane-binding loops of NUP133, NUP160, and NUP155
[predicted by similarity to known or predicted
ALPS motifs in X. laevis and S. cerevisiae
homologs (6, 21, 89)] to interact with the envelope,
implemented using MapDistanceTransform of IMP;
and elastic network restraints
derived from the subcomplexes modeled with
AlphaFold/ColabFold. The final atomic structures
were generated from the refined models
by back-mapping the coarse-grained representation
to the original AlphaFold atomic models.
The conformation of the linkers was further
optimized using Modeller (90) and Isolde (91).
The stereochemistry of the final model was
optimized using steepest descent minimization
in GROMACS (92).
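
To make the two-resolution representation concrete, the sketch below groups a chain of Cα coordinates into 10-residue beads. It is an illustration only: IMP and PMI build these representations internally, and placing each bead at the centroid of its residues is an assumption made here for simplicity. The function name is hypothetical.

```python
def beads_from_ca(ca_coords, residues_per_bead=10):
    """Group consecutive C-alpha coordinates into beads of `residues_per_bead`
    residues; each bead is placed at the centroid of its residues (assumption).
    A shorter final stretch still yields one bead."""
    beads = []
    for start in range(0, len(ca_coords), residues_per_bead):
        chunk = ca_coords[start:start + residues_per_bead]
        n = len(chunk)
        beads.append(tuple(sum(coord[axis] for coord in chunk) / n for axis in range(3)))
    return beads


# Hypothetical 25-residue chain laid out along the x axis (arbitrary units).
chain = [(float(i), 0.0, 0.0) for i in range(25)]
print(beads_from_ca(chain))
```

For the 25-residue toy chain this yields three beads, which shows how the number of particles entering most restraints drops roughly tenfold.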

The model of the dilated NPC was built by
fitting the asymmetric units of the individual
cytoplasmic, inner, and nuclear rings of the
constricted NPC model to the dilated cryo-ET
maps and refining the fits with Assembline.
The refinement procedure was performed
as above.

To calculate the percentage of the molecular
weight of the full NPC and the NPC scaffold
covered by the new and the old models, we
defined the full NPC as being composed of the
following 32 NUPs, with the stoichiometry
indicated in the parentheses: NUP160 (32),
NUP96 (32), NUP85 (32), SEH1 (32), SEC13
(32), NUP107 (32), NUP133 (32), NUP358 (40),
NUP43 (32), ELYS (16), NUP37 (32), NUP188
(16), NUP205 (40), NUP155 (48), NUP93 (56),
NUP35 (32), NUP62 (48), NUP54 (32), NUP58
(32), NUP88 (16), NUP214 (16), NUP98 (48),
NDC1 (16), NUP210 (64), ALADIN (16),
POM121 (32), TPR (32), NUP153 (32), NUP50
(16), CG1 (8), DDX19 (16), and GLE1 (8). The
scaffold NPC was defined as being composed
of 25 NUPs: NUP160, NUP96, NUP85, SEH1,
SEC13, NUP107, NUP133, NUP358, NUP43,
ELYS, NUP37, NUP188, NUP205, NUP155,
NUP93, NUP35, NUP62, NUP54, NUP58,
NUP88, NUP214, NUP98, NDC1, NUP210,
and ALADIN. The stoichiometry for the scaffold
was the same as for the full NPC, with the
exception of the NUP214 complex, for which only
one copy was counted, as the second copy is
not clearly visible in the EM density. Note
that for some nucleoporins, like NUP98 or
POM121, the exact stoichiometry is still uncertain.
The coiled-coil domains of the peripheral
NUPs of the NUP214 complex and the
α-solenoid domain of NUP358 were included
in the scaffold. The FG regions were excluded.
These definitions resulted in a molecular
weight of 119 MDa for the full NPC and 76 MDa
for the scaffold. The scaffold diameters were
described by two distances between the opposite
spokes: the membrane-to-membrane
distance and the distance between ferredoxin-like
domains of NUP54 at residue 220. Figures
were produced using UCSF ChimeraX (82).


Molecular dynamics (MD) simulations 


We performed MD simulations of half-toroidal 
membrane pores in isolation and including 
the hNPC scaffold. In the following section, we 
describe the setup of the simulation models, 
the relevant MD parameters, and the analysis 
of the MD trajectories. 


Membrane model 


First, a 30 nm by 30 nm coarse-grained POPC 
lipid bilayer patch was generated using 
insane.py (93, 94). The bilayer was placed in 
a periodic simulation box, solvated on both 
sides, energy minimized, and simulated for 
100 ns using standard MD parameters, as 
noted below. 

Then, half-toroidal membrane pores were
constructed with the BUMPy software (93)
using this initial flat bilayer as membrane
input. The following command line flags were
used when running bumpy:
-s double_bilayer_cylinder -z 10 -g l_cylinder:10 r_cylinder:430 r_junction:120 l_flat:1400
(see bumpy documentation). The resulting membranes
coincided reasonably with the cryo-ET density of
the double-membrane pore and allowed us to
place the membrane-anchoring motifs of the
NPC model into the membrane.

Mosalaganti et al., Science 376, eabm9506 (2022)

Two carbon nanotube porins (CNTPs) were
inserted into the membrane in the corners of
the simulation box distant from the NPC (see,
e.g., fig. S19A). The CNTPs with a length of
3.6 nm and a diameter of 14.7 nm enable
water transfer in and out of the otherwise
disconnected luminal volume, as is required
for membrane-mechanical equilibration. Without
CNTPs, the luminal volume would be
effectively fixed and, as a result, changes in the
membrane shape during MD simulations
without NPC scaffold would induce artifactual
membrane buckling. The CNTPs were built according
to previous work (93-95). The outermost
carbon rings at either CNTP end consisted
of polar SNda beads for stable membrane
embedding (93). To stiffen the wide CNTPs,
the improper dihedral force constant was
increased to 1000 kJ mol⁻¹ rad⁻². The CNTP
parameters were otherwise set as previously
reported (95, 96). The code to generate CNTP
models and the parameters for simulations
are available at
https://github.com/bio-phys/cnt-martini (95).
The CNTPs were embedded
in the flat patch of the NPC membranes away
from the NPC. Lipids within 8 Å of the CNTPs
or inside their circumference were removed.


NPC scaffold model 


The MD simulation model of the NPC in- 
cluded the entire scaffold (see table S4 and 
fig. S19 for a summary of the hNPC simula- 
tion model) except for the disordered FG-NUP 
C- and N-terminal tails. For simplicity and to 
limit the system size, we also excluded the 
NUP210 glycoprotein in the nuclear envelope 
lumen. Otherwise, the models were complete 
as described above. 

Each protein chain was coarse-grained individually
using martinize.py as follows. All
chain termini were uncharged and otherwise
default protonation states were used. Secondary
structure restraints were assigned according to
DSSP (97). The tertiary structure of each protein
chain was maintained by an elastic network
using the recommended default settings with
a cutoff R_c of 0.9 nm and a force constant k of
500 kJ mol⁻¹ nm⁻². For each protein chain, the
ElNeDyn2.2 protein force field was used in
conjunction with the Martini 2.2 force field
(53, 54). Simulations were performed with the
default protein-protein interaction (α = 1.0;
results shown in the supplementary materials)
and with protein-protein interactions scaled
relative to protein-solvent interactions with
α = 0.7 (98) (results shown in the main text)
to correct for the effect of reportedly overestimated
nonbonded interactions (99). This
procedure used the martinize.py script and
was wrapped in custom Python code to automatically
generate the structures for each
protein chain with the aforementioned parameters.
To enable easier handling of the large
number of protein chains, each protein chain
was assigned a unique segid. Importantly, with
this MD simulation model, all protein-protein
interactions between distinct chains could dissociate
and new interactions could form in
principle, and the structure of linker regions
could relax.

10 June 2022

All individually coarse-grained protein chains 
were then merged into one PDB structure 
file. The resulting coarse-grained NPC scaffold 
model was centered within the half-toroidal 
membrane pore model containing the CNTPs 
described above. Any lipids within 8 A of any 
bead of the scaffold proteins in the initial as- 
sembly were removed. 


Solvation 


All systems were solvated with coarse-grained 
water containing 10% anti-freeze WF particles 
and Na⁺ ions to neutralize the system using
standard GROMACS tools. All systems simu- 
lated in this study are listed in table S5. 


MD simulations 


All molecular dynamics simulations were performed
using the GROMACS software package
and the coarse-grained Martini force field v2.2
(53, 92, 100). Each system was first steepest-descent
energy minimized using a soft-core
potential to remove steric clashes in the initial
model. The systems were then equilibrated in
an NPT ensemble with semiisotropic pressure
coupling first for 2.5 ns with a 5-fs timestep
and then for 100 ns with a 15-fs timestep with
position restraints on the protein backbone
beads with a force constant of 1000 kJ mol⁻¹
nm⁻², maintaining a temperature of 310 K
and pressure of 1 bar using the Berendsen
barostat and velocity rescaling thermostat
(101, 102). Characteristic coupling times of 12
and 1 ps were used, respectively. During production
simulations, the Parrinello-Rahman barostat
was used (103).

The Verlet neighbor search algorithm was 
used to update the neighbor list, with the length 
and update frequency being automatically 
determined. Lennard-Jones and Coulomb forces 
were cut off at 1.1 nm, with the potential shifted 
to zero using the Verlet-shift potential modifier. 
A 15-fs timestep was used in all production 
simulations. Production simulations were performed
for ~1.2 µs each.
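
For orientation, the run lengths quoted above translate into the following numbers of integration steps. This is plain arithmetic on the stated durations and timesteps; the helper name is hypothetical, and the production length is treated as exactly 1.2 µs although the text gives it only approximately.

```python
def n_steps(duration_ns: float, timestep_fs: float) -> int:
    """Number of MD integration steps for a run of `duration_ns` nanoseconds."""
    femtoseconds = duration_ns * 1_000_000  # 1 ns = 1e6 fs
    return round(femtoseconds / timestep_fs)


equilibration_short = n_steps(2.5, 5)    # first equilibration stage
equilibration_long = n_steps(100, 15)    # second equilibration stage
production = n_steps(1200, 15)           # ~1.2 microseconds of production

print(equilibration_short, equilibration_long, production)
```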


Membrane tension 


To apply lateral tension on the double-membrane
structure, an anisotropic pressure
tensor was used with an out-of-plane
pressure of $P_\perp = p + 2\Delta P/3$ and an in-plane
pressure of $P_\parallel = p - \Delta P/3$, with $p = 1$ bar.
This results in a traceless lateral strain
$S = \mathrm{diag}(-\Delta P/3, -\Delta P/3, 2\Delta P/3)$, where
$\Delta P = P_\perp - P_\parallel$. The resulting tension on the
double-membrane system is
$\sigma = (P_\perp - P_\parallel) L_z = \Delta P\, L_z$,
with $L_z$ the box height. To allow for gradual
equilibration under tension, $\Delta P$ was increased
in steps of 1 bar until reaching the target value
(see table S1).
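
The relations in this section can be turned into a small calculator. The sketch assumes that the out-of-plane pressure includes the reference pressure, that is, P_perp = p + 2ΔP/3 and P_par = p − ΔP/3, so that their difference equals ΔP. It converts the tension σ = ΔP L_z from bar nm to mN/m. The function names, the example pressure difference, and the example box height are hypothetical and are not values from table S1.

```python
BAR_NM_TO_MN_PER_M = 0.1  # 1 bar * 1 nm = 1e5 Pa * 1e-9 m = 1e-4 N/m = 0.1 mN/m


def pressure_components(delta_p_bar: float, p_bar: float = 1.0):
    """Return (out-of-plane, in-plane) pressures in bar for a given delta P."""
    p_perp = p_bar + 2.0 * delta_p_bar / 3.0
    p_par = p_bar - delta_p_bar / 3.0
    return p_perp, p_par


def membrane_tension(delta_p_bar: float, box_height_nm: float) -> float:
    """Tension sigma = delta P * L_z on the double-membrane system, in mN/m."""
    return delta_p_bar * box_height_nm * BAR_NM_TO_MN_PER_M


def ramp_schedule(target_bar: float, step_bar: float = 1.0):
    """Delta P values visited when increasing in fixed steps up to the target."""
    n_full = int(target_bar // step_bar)
    schedule = [step_bar * (i + 1) for i in range(n_full)]
    if not schedule or schedule[-1] < target_bar:
        schedule.append(target_bar)
    return schedule


# Hypothetical example: delta P = 3 bar in a box that is 50 nm high.
print(pressure_components(3.0))
print(membrane_tension(3.0, 50.0))
print(ramp_schedule(3.0))
```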


Analysis of MD simulations 


Images and movies were generated using
VMD (104) and time series were analyzed
using the MDAnalysis library (105). To monitor
conformational changes, we calculated the
root-mean-square distance (RMSD) from the
starting structure using the qcprot RMSD
alignment algorithm implemented in MDAnalysis
(105). The RMSD was calculated every
1.5 ns for the backbone (BB) beads with respect
to the rigid-body aligned initial structure.
In addition to the individual protein
chains, we analyzed in this way the β-propeller
present in the three nucleoporins NUP133
(residues 1 to 480), NUP155 (residues 1 to
500), and NUP160 (residues 1 to 500); the
respective α-solenoid domains of
NUP133 (residues 500 to end) and of NUP155
and NUP160 (residues 507 to end); and each
of the eight spokes as a whole. In the RMSD
analysis, averages and standard deviations
were calculated across the eight spokes or
across equivalent protein copies in the NPC
scaffold, respectively.

During the MD simulations, the diameter 
of the NPC membrane pore was determined 
by least-square fitting the center and radius 
of a circle in the xy-plane to the membrane 
center (C4A and C4B lipid beads). The fit was 
performed at the narrowest region of the 
half-toroidal membrane pore. 
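
A least-squares circle fit of this kind can be sketched as below. The text does not state which fitting algorithm was used, so the algebraic (Kåsa) formulation chosen here is an assumption; it solves for the circle x² + y² + Dx + Ey + F = 0 through a 3 × 3 linear system. The function names and the example coordinates are hypothetical; in practice the input would be the xy positions of the C4A and C4B beads at the narrowest region of the pore.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_circle(points):
    """Algebraic least-squares fit of x^2 + y^2 + D*x + E*y + F = 0.
    Returns (center_x, center_y, radius)."""
    n = len(points)
    z = [x * x + y * y for x, y in points]
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxz = sum(x * zi for (x, _), zi in zip(points, z))
    syz = sum(y * zi for (_, y), zi in zip(points, z))
    sz = sum(z)

    # Normal equations of the linear least-squares problem in (D, E, F).
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    b = [-sxz, -syz, -sz]
    det_a = det3(a)
    if abs(det_a) < 1e-12:
        raise ValueError("points are collinear or too few for a circle fit")

    solution = []
    for col in range(3):  # Cramer's rule, one unknown per column
        m = [row[:] for row in a]
        for row in range(3):
            m[row][col] = b[row]
        solution.append(det3(m) / det_a)
    d, e, f = solution
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)


# Hypothetical bead positions (nm) on a circle centered at (1, -2), radius 5.
beads = [(1.0 + 5.0 * math.cos(2 * math.pi * k / 12),
          -2.0 + 5.0 * math.sin(2 * math.pi * k / 12)) for k in range(12)]
cx, cy, radius = fit_circle(beads)
print(cx, cy, 2.0 * radius)  # center and pore diameter
```

An algebraic fit is fast and has a closed-form solution, which suits per-frame evaluation along a trajectory; a geometric fit that minimizes true radial distances would be an alternative if the bead positions are very noisy.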


Possible limitations 


We note that the time scale currently accessi- 
ble to MD simulations is too short to fully 
recapitulate the complete NPC dilation and 
constriction processes, including the large- 
scale NPC structural rearrangements. We also 
note that the elastic network on proteins of 
the Martini model restricts internal confor- 
mational changes, which might be required for 
larger-scale NPC dilation. The coarse-grained 
interaction model may also weaken some 
protein-protein interactions and strengthen 
others. Finally, we expect that the missing 
FG mesh in the MD model contributes to the 
compaction of the NPC scaffold seen in the MD 
simulations, acting on top of the mechanical 
tension in the widened double-membrane pore 
(supplementary text). 
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INTRODUCTION: In eukaryotic cells, the selec- 
tive bidirectional transport of macromolecules 
between the nucleus and cytoplasm occurs 
through the nuclear pore complex (NPC). Embedded
in nuclear envelope pores, the ~110-MDa
human NPC is an ~1200-Å-wide and ~750-Å-tall
assembly of ~1000 proteins, collectively
termed nucleoporins. Because of the NPC's eightfold
rotational symmetry along the nucleocytoplasmic
axis, each of the ~34 different
nucleoporins occurs in multiples of eight. Ar- 
chitecturally, the NPC’s symmetric core is 
composed of an inner ring encircling the cen- 
tral transport channel and two outer rings 
anchored on both sides of the nuclear enve- 
lope. Because of its central role in the flow of 
genetic information from DNA to RNA to 
protein, the NPC is commonly targeted in viral 
infections and its nucleoporin constituents are 
associated with a plethora of diseases. 


RATIONALE: Although the arrangement of most 
scaffold nucleoporins in the NPC’s symmetric 
core was determined by quantitative docking 
of crystal structures into cryo-electron tomographic
(cryo-ET) maps of intact NPCs, the
topology and molecular details of their cohe- 
sion by multivalent linker nucleoporins have 
remained elusive. Recently, in situ cryo-ET re- 
constructions of NPCs from various species 
have indicated that the NPC’s inner ring is 
capable of reversible constriction and dilation 
in response to variations in nuclear envelope 
membrane tension, thereby modulating the 
diameter of the central transport channel by 
~200 Å. We combined biochemical reconstitution,
high-resolution crystal and single-particle
cryo-electron microscopy (cryo-EM) structure 
determination, docking into cryo-ET maps, and 
physiological validation to elucidate the molec- 
ular architecture of the linker-scaffold interac- 
tion network that not only is essential for the 
NPC's integrity but also confers the plasticity and 
robustness necessary to allow and withstand 
such large-scale conformational changes. 


Linker-scaffold architecture in the human NPC’s symmetric core. Near-atomic composite structure of
the NPC’s symmetric core obtained by quantitative docking of high-resolution crystal and single-particle
cryo-EM structures into a cryo-ET reconstruction of the intact human NPC. Schematic representations of
the intricate linker-scaffold topology of the cytoplasmic outer ring, inner ring, and nuclear outer ring
(clockwise from top) are depicted for the boxed regions. C, C terminus; N, N terminus.


RESULTS: By biochemically mapping scaffold-binding
regions of all fungal and human linker
nucleoporins and determining crystal and
single-particle cryo-EM structures of linker-scaffold
complexes, we completed the characterization
linker-scaffold network and established its 
evolutionary conservation, despite considerable 
sequence divergence. We determined a series of 
crystal and single-particle cryo-EM structures of 
the intact Nup188 and Nup192 scaffold hubs 
bound to their Nic96, Nup145N, and Nup53 
linker nucleoporin binding regions, reveal- 
ing that both proteins form distinct question 
mark-shaped keystones of two evolutionarily 
conserved hetero-octameric inner ring com- 
plexes. Linkers bind to scaffold surface pockets 
through short defined motifs, with flanking 
regions commonly forming additional disperse 
interactions that reinforce the binding. Using a 
structure-guided functional analysis in Sac- 
charomyces cerevisiae, we confirmed the ro- 
bustness of linker-scaffold interactions and 
established the physiological relevance of our 
biochemical and structural findings. The near- 
atomic composite structures resulting from 
quantitative docking of experimental struc- 
tures into human and S. cerevisiae cryo-ET 
maps of constricted and dilated NPCs struc- 
turally disambiguated the positioning of the 
Nup188 and Nup192 hubs in the intact fungal 
and human NPC and revealed the topology of 
the linker-scaffold network. The linker-scaffold 
gives rise to eight relatively rigid inner ring 
spokes that are flexibly interconnected to al- 
low for the formation of lateral channels. Un- 
expectedly, we uncovered that linker-scaffold 
interactions play an opposing role in the outer 
rings by forming tight cross-link staples be- 
tween the eight nuclear and cytoplasmic outer 
ring spokes, thereby limiting the dilatory move- 
ments to the inner ring. 


CONCLUSION: We have substantially advanced 
the structural and biochemical characterization 
of the symmetric core of the S. cerevisiae and 
human NPCs and determined near-atomic 
composite structures. The composite structures 
uncover the molecular mechanism by which 
the evolutionarily conserved linker-scaffold es- 
tablishes the NPC’s integrity while simultane- 
ously allowing for the observed plasticity of the 
central transport channel. The composite struc- 
tures are roadmaps for the mechanistic dissection 
of NPC assembly and disassembly, the etiology 
of NPC-associated diseases, the role of NPC 
dilation in nucleocytoplasmic transport of solu- 
ble and integral membrane protein cargos, and 
the anchoring of asymmetric nucleoporins.

Nuclear pore complexes (NPCs) mediate the nucleocytoplasmic transport of macromolecules. Although the
arrangement of the structured scaffold nucleoporins in the NPC’s symmetric core has been determined,
their cohesion by multivalent unstructured linker nucleoporins has remained elusive. Combining biochemical
reconstitution, high-resolution structure determination, docking into cryo-electron tomographic
reconstructions, and physiological validation, we elucidated the architecture of the evolutionarily 
conserved linker-scaffold, yielding a near-atomic composite structure of the human NPC’s ~64-megadalton 
symmetric core. Whereas linkers generally play a rigidifying role, the linker-scaffold of the NPC provides the 
plasticity and robustness necessary for the reversible constriction and dilation of its central transport 
channel and the emergence of lateral channels. Our results substantially advance the structural 
characterization of the NPC symmetric core, providing a basis for future functional studies. 


The enclosure of genetic material in the
nucleus requires the selective transport
of folded proteins and ribonucleic acids
across the nuclear envelope, for which
the nuclear pore complex (NPC) is the
sole gateway (1-4). Beyond its function as a
selective, bidirectional channel for macromolecules,
the role of the NPC extends to genome
organization, transcription regulation,
mRNA maturation, and ribosome assembly
(1, 2). The NPC and its components are implicated
in the etiology of many human diseases,
including viral infections (5, 6). The building 
blocks of the NPC are a set of ~34 different 
proteins collectively termed nucleoporins (nups). 
In the NPC, nups assemble into defined sub- 
complexes that are generally present in multi- 
ples of eight, adding up to a mass of ~110 MDa 
in the human NPC (1-4). The NPC architecture
consists of a symmetric core with asymmetric 
decorations on its nuclear and cytoplasmic 
faces (Fig. 1A). The symmetric core displays 
eight- and twofold rotational symmetry about 
the nucleocytoplasmic axis and axes coplanar 
with the nuclear envelope, respectively. It con- 
sists of two outer rings that sit on top of the 
nuclear envelope and an inner ring that lines 
the lumen generated by the fusion of the two 
lipid bilayers of the nuclear envelope. From 
the inner ring, unstructured phenylalanine- 
glycine (FG) repeats are projected into the
central transport channel to establish the diffusion
barrier. Although ~40 kDa has historically
been considered the threshold for passive
diffusion (7, 8), the size selectivity of the barrier 
shows a more gradual dependence on molec- 
ular mass (9). Active transport is generally 
mediated by karyopherins, whose affinity for 
FG repeats and ultrafast exchange kinetics al- 
low karyopherin-bound cargo to traverse the 
diffusion barrier (10-13).
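
The eight- and twofold rotational symmetry of the symmetric core described above implies that a single asymmetric unit generates 16 equivalent copies. The sketch below illustrates this with plain rotation matrices. It assumes the nucleocytoplasmic axis lies along z, that one twofold axis lies along x in the plane of the nuclear envelope, and that coordinates are N x 3 arrays; these conventions and all names are illustrative assumptions, not taken from the text.

```python
import numpy as np


def rot_z(angle):
    """Rotation matrix for a rotation by `angle` (radians) about z."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Twofold (180 degree) rotation about the x axis, which is assumed
# to lie in the plane of the nuclear envelope.
ROT_X_180 = np.diag([1.0, -1.0, -1.0])


def symmetric_copies(protomer, n_fold=8):
    """Return an array of shape (2 * n_fold, N, 3) holding all copies
    of `protomer` related by n-fold and twofold rotational symmetry."""
    copies = []
    for flip in (np.eye(3), ROT_X_180):
        for k in range(n_fold):
            op = rot_z(2.0 * np.pi * k / n_fold) @ flip
            copies.append(protomer @ op.T)
    return np.stack(copies)
```

The first eight copies correspond to one face of the pore and the last eight to the face related by the twofold axis.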

The structural characterization of the NPC 
has progressed through efforts to reconstitute 
and crystallize ever larger portions of it, from 
small nup domain fragments to complexes 
as large as the ~400-kDa heteroheptameric 
Y-shaped coat nup complex (CNC) (14-30). In
parallel, progress has been driven by efforts 
to push the resolution of cryo-electron tomo- 
graphic (cryo-ET) reconstructions of intact 
NPCs (31). The docking of the CNC crystal
structure into an ~32-Å cryo-ET map of the
intact human NPC was the first demonstra- 
tion that biochemical reconstitution and 
crystal structures could be used to interpret 
cryo-ET maps and unraveled the head-to-tail 
tandem arrangement of CNCs in the outer 
rings (29, 32). The reconstitution and piece- 
meal structural analysis of two heteronona- 
meric ~425-kDa inner ring complexes provided 
the basis for docking 17 symmetric core nups 
into an ~23-Å cryo-ET map of the intact human
NPC (28, 33), yielding a near-atomic composite 
structure of the entire ~56-MDa symmetric 
core of the human NPC (34, 35). Subsequently, 
a similar approach elucidated the near-atomic 
architectures of constricted and dilated states 
of the Saccharomyces cerevisiae NPC using 
crystal structures to interpret ~25-Å cryo-ET
maps (36, 37). Apart from an additional dis- 
tal CNC ring and associated nups present in 
the outer rings of the human NPC, the human
and S. cerevisiae NPCs present equivalent nup
arrangements (34-38). 

The inner ring of the human NPC is com- 
posed of six scaffold nups called NUP155, 
NUP188, NUP205, NUP54, NUP58, and NUP62; 
two primarily unstructured linker nups called 
NUP53 and NUP98; and NUP93, which is a 
hybrid of both (34, 35). The doughnut-shaped 
inner ring adopts a concentric cylinder archi- 
tecture, in which membrane-anchored NUP155 
forms the outermost coat, followed by layers of 
NUP93, NUP205 or NUP188, and the
NUP54•NUP58•NUP62 channel nucleoporin heterotrimer
(CNT) in the center, providing the FG
repeats to form the diffusion barrier in the 
central transport channel. Unlike the exten- 
sive interactions of large, folded domains 
found in the CNC (16-20, 22, 25, 27, 29, 39), 
the structured domains of the inner ring nups 
do not interact directly. Instead, the inner ring 
is held together by the linker nups NUP53 and 
NUP98 and the linker region of NUP93, which 
are proposed to connect the scaffolds of the 
four layers (28, 33, 35, 40-42). The resulting 
linker-scaffold architecture allows for a substantial
~200-Å dilation of the inner ring’s
central transport channel, accompanied by 
the generation of lateral channels between 
the eight spokes, as observed in recent cryo-ET 
analyses of purified and in situ human and 
fungal NPCs (36, 43-46). The linker-scaffold is 
expected to play a fundamental role in estab- 
lishing an architectural framework to accom- 
modate the structural changes associated with 
the reversible constriction and dilation of the 
inner ring. 

Whereas our previous work achieved the 
identification of most of the scaffold nup lo- 
cations in the NPC, a comprehensive under- 
standing and the molecular details of the 
linker-scaffold interaction network that medi- 
ates the cohesion of the symmetric core have 
remained elusive. Here, we report the char- 
acterization of the complete tractable set of 
linker-scaffold interactions through residue- 
level biochemical mapping of scaffold-binding 
regions in linker nups and the determination 
of crystal and single-particle cryo-electron 
microscopy (cryo-EM) structures of linker- 
scaffold complexes. Our analysis revealed a 
common linker-scaffold binding mode, where- 
by linkers are anchored by central structured 
motifs whose binding is reinforced by disperse 
interactions of flanking regions. We quantita- 
tively docked the complete set of linker-scaffold 
structures into an ~12-Å cryo-ET reconstruction
of the human NPC (provided by Martin Beck’s
group) (47) and an ~25-Å in situ cryo-ET reconstruction
of the S. cerevisiae NPC (36). In
the inner ring, our new linker-scaffold struc- 
tures allowed for the unambiguous assign- 
ment of NUP188 and NUP205 to 16 peripheral 
and 16 equatorial positions, respectively. From 
the nuclear envelope to the central transport

Fig. 1. Outline of the symmetric core inner ring architecture. (A) Cross-sectional schematic of the NPC architecture. POMs, integral membrane proteins of the
pore membrane domain. (B) Domain structures of the C. thermophilum inner ring linker and scaffold nups. Nic96 consists of linker (residues 1 to 390) and scaffold
(residues 391 to 1112) regions. Nomenclature for nup homologs from C. thermophilum, S. cerevisiae, and H. sapiens is indicated. Multiple paralogs exist for some
S. cerevisiae nups. (C and D) Schematic map of previously established linker-scaffold interactions in alternative, mutually exclusive inner ring complexes organized
around Nup192 and Nup188 scaffold hubs (35). Black lines connecting colored bars indicate interactions between nup regions. C, C terminus; N, N terminus.


channel, linkers bridge the layers of the in- 
ner ring to coalesce scaffold nups into eight 
relatively rigid spokes that are flexibly in- 
terconnected, allowing for the formation of 
lateral channels. The linker-scaffold confers 
the plasticity necessary for the reversible 
dilation and constriction of the inner ring in 
response to alterations in nuclear envelope 
membrane tension. The topology of linker- 
scaffold interactions between inner ring nups 
is conserved from fungi to humans. We carried 
out systematic functional analyses of the 
linker-scaffold network, including the devel- 
opment of a minimal linker S. cerevisiae strain, 
establishing its robustness and essential na- 
ture. Our quantitative docking analysis of the 
human NPC revealed eight NUP205•NUP93
complexes that cross-link adjacent spokes in
both nuclear and cytoplasmic outer rings.
Facing the central transport channel, eight
additional NUP205•NUP93 copies are exclusively
anchored at the base of the cytoplasmic
outer ring. NUP93 emerges as a versatile linker- 
scaffold hybrid that recruits and positions the 
FG repeat-harboring CNT to the inner ring and
reinforces the tandem head-to-tail CNC arrangement
in the outer rings, explaining its fundamental
role in maintaining the integrity of the
entire NPC. Our analysis substantially advances 
the structural characterization of the ~64-MDa 
symmetric core and lays out a roadmap for fu- 
ture studies on the NPC assembly and function. 


Results 

Biochemical and structural analysis 
of Chaetomium thermophilum 
linker-scaffold interactions 


Nup192 and Nup188 are question mark-shaped 
scaffold keystones of two alternative eight- 
protein inner ring complexes, both of which 
include the linkers Nup145N, Nup53, and the 
scaffold-linker hybrid Nic96 (Fig. 1, B to D) 
(28, 33, 35, 40, 41). Previously, a composite 
structure of the full-length ~200-kDa Nup192 
was determined by superposing overlapping 
structures of its N- and C-terminal parts, revealing
an extended α-helical solenoid with
a question mark-shaped architecture, com- 
posed of 5 HEAT and 15 ARM repeats and a 
prominent central Tower (35, 40, 41, 48). High- 
resolution structures of Nup188 encompassing
the ~130-kDa N-terminal domain (NTD),
which contains a central SH3-like domain insertion,
and the ~45-kDa C-terminal Tail region
again revealed extended α-helical solenoids
composed of ARM and HEAT repeats (28, 49). 
However, no structural information could so 
far be obtained for the ~32-kDa Nup188 central 
region equivalent to the Nup192 Tower. Although 
binding of Nup192 and Nup188 has been bio- 
chemically mapped to the Nic96, Nup53, and 
Nup145N linkers, the molecular details are unknown.
To gather these details, which are crucial
to the elucidation of the linker-scaffold archi- 
tecture, we performed the following compre- 
hensive biochemical and structural analyses. 


Nup192 interaction with Nic96 


Using a crystallizable Nup192^ΔHead (residues 153
to 1756) fragment (35), we obtained cocrystals
with our previously biochemically mapped
Nic96^187-301 fragment that diffracted to 3.6-Å
resolution (fig. S1). Nic96 residues 187 to 239
were not resolved and were found to be dispensable
for Nup192 binding by isothermal titration
calorimetry (ITC), because both Nic96^187-301
and Nic96^R2 (residues 240 to 301) retained a
dissociation constant (KD) of ~75 nM (Fig. 2E
and fig. S2, A and D). The Nic96^R2 sequence
register was unambiguously assigned by identifying
seleno-L-methionine (SeMet)-labeled
residues in anomalous difference Fourier
maps (fig. S1). To obtain the structure of full-length
Nup192•Nic96^R2, we determined a single-particle
cryo-EM reconstruction at 3.8-Å global
resolution from a refined set of 176,609 particles
whose preferential orientation resulted in
an anisotropic 3.5- to 4.0-Å directional Fourier
shell correlation (FSC) resolution range (Fig. 2B
and fig. S3). Nic96^R2 forms two amphipathic α
helices connected by a sharply kinking loop
that extend from the midpoint to the base of
the Nup192 question mark-shaped α-helical
solenoid. The longer N-terminal α helix is cradled
by the concave Nup192 surface formed by
10 ARM and HEAT repeats and the central
Tower, whereas the shorter C-terminal α helix
packs against a hydrophobic patch formed
from the C-terminal Nup192 α helices α75 to
α77 (ARM-20). Comparison of our crystal and
cryo-EM structures confirmed the molecular
details of Nic96^R2 binding and identified a
conformational difference in the width of the
gap between the Nup192 Head and Tower
subdomains (fig. S4). To validate the molecular
details of the Nup192-Nic96^R2 interface, we
performed structure-guided mutagenesis and
assessed binding by size exclusion chromatography
coupled to multiangle light scattering
(SEC-MALS) and ITC. Consistent with an
~3700-Å² hydrophobic interface, binding was
not strongly affected by individual substitutions
and was only abolished by Nic96 FFF
or Nup192 LAF combination mutations (F,
Phe; L, Leu; A, Ala; Fig. 2, C to F, and figs. S2
and S5 to S7).


Nup188 interaction with Nic96 


Next, we tested whether the same Nic96 region
was sufficient for Nup188 binding. Indeed,
ITC measurements revealed that Nup188
binds both Nic96^R2 and Nic96^187-301 with
similar KDs of ~90 nM (Fig. 2J and fig. S8, A
and D). We determined crystal structures of
Nup188•Nic96^R2 and Nup188^NTD (residues 1 to
1134) at 4.4- and 2.8-Å resolution, respectively,
the latter of which aided with phasing and
model building. The Nup188 and Nic96^R2 sequence
registers were unambiguously assigned
by identifying SeMet-labeled residues in anomalous
difference Fourier maps (Fig. 2G and
fig. S9). Like Nup192, Nup188 adopts an overall
question mark-shaped architecture, composed
of an N-terminal Head subdomain, 9
HEAT repeats, 13 ARM repeats, and a central,
comparatively more compact Tower (Fig. 2G).
Similarly, Nic96^R2 binds a concave surface between
the midpoint and the base of the Nup188
question mark-shaped α-helical solenoid, burying
~3700 Å² of combined surface area. Although
Nup188-bound Nic96^R2 also forms two
amphipathic α helices, the α helices start and
end at different residues, resulting in a secondary
structure that radically differs from the
Nup192-bound form (Fig. 2, B and G). Nup188-bound
Nic96^R2 has a shorter N-terminal helix
that binds to the central Tower and a longer
C-terminal helix cradled in the concave surface
at the base of Nup188 (Fig. 2G). Notably, the
Nic96^R2 FFF mutation that abolishes Nup192
binding had the same effect on Nup188 binding,
despite the structural polymorphism between
Nup192- and Nup188-bound Nic96^R2 (Fig. 2,
H to K, and figs. S8B, S10, and S12). Analogous
to the Nup192 LAF mutant, we identified a
triple Nup188 FLV substitution that disrupted
Nic96^R2 binding (V, Val; Fig. 2, I to K, and figs.
S8C, S11, and S12).
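
The dissociation constants reported from ITC (~75 nM for Nup192 and ~90 nM for Nup188) can be placed on an energy scale with the standard relation ΔG° = RT ln(KD / 1 M). The sketch below is a worked conversion only. The temperature of 298.15 K is an assumption, because the measurement temperature is not stated in this section, and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol for a dissociation
    constant given in mol/L (1 M standard state). The default
    temperature of 25 degrees C is an assumption."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_a, kd_b, temperature_k=298.15):
    """Free energy difference (kcal/mol) of interaction b relative to a.
    Positive values mean that b binds more weakly than a."""
    return (binding_free_energy(kd_b, temperature_k)
            - binding_free_energy(kd_a, temperature_k))
```

Under these assumptions, the difference between a 75 nM and a 90 nM interaction is on the order of a tenth of a kcal/mol, consistent with the description of the two affinities as similar.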


Nup192 interaction with Nup145N 
and Nup53 


To identify the Nup145N regions necessary 
and sufficient for Nup192 binding, we per- 
formed a five-alanine scanning mutagenesis 
and truncation analysis of Nup145N (Fig. 
3A). Substituting five consecutive residues at 
a time to alanines, we found a hotspot be- 
tween residues 626 and 655 that displayed 
diminished binding to Nup192 (Fig. 3A and 
fig. S13). N- and C-terminal Nup145N truncation
resulted in a minimal Nup145N^R1 peptide
(residues 616 to 683) that recapitulated
the Nup192-Nup145N interaction, although 
shorter Nup145N fragments showed residual 
binding to Nup192 (Fig. 3B and fig. S14). ITC 
measurements confirmed that Nup192 bind- 
ing is primarily sustained by Nup145N’s R1 
region, with KDs of ~825 and ~1600 nM for
Nup145N and Nup145N^R1 binding, respectively
(Fig. 3G and fig. S15). 
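
The five-alanine scanning design described above can be sketched as a simple enumeration of mutant sequences. The example assumes non-overlapping, consecutive five-residue windows, which the text implies but does not spell out, and it uses a hypothetical function name; any sequence passed to it here is a placeholder rather than the Nup145N sequence.

```python
def five_alanine_scan(sequence, start=1, window=5):
    """Enumerate alanine-block mutants of `sequence`.

    Each mutant replaces one consecutive block of `window` residues
    with alanines. `start` is the residue number of the first
    position. Returns a list of (first_residue, last_residue,
    mutant_sequence) tuples. Non-overlapping tiling is an assumption.
    """
    mutants = []
    for offset in range(0, len(sequence), window):
        block = sequence[offset:offset + window]
        mutated = (sequence[:offset]
                   + "A" * len(block)
                   + sequence[offset + window:])
        first = start + offset
        last = first + len(block) - 1
        mutants.append((first, last, mutated))
    return mutants
```

Each mutant would then be tested for binding, and windows whose substitution weakens binding mark candidate hotspots such as the one reported between residues 626 and 655.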

With our previously mapped minimal Nup53^R1
fragment (residues 31 to 67) (41), we reconstituted
an ~220-kDa Nup192•Nic96^R2•Nup145N^R1•Nup53^R1
complex and obtained a single-particle
cryo-EM reconstruction at 3.2-Å global resolution
from a selected set of 484,910 particles
whose preferential orientation resulted in an
anisotropic 3.1- to 3.6-Å directional FSC resolution
range (Fig. 3C and fig. S16). For Nup53^R1,
the cryo-EM map only resolved the central
phenylalanine-glycine (FG) dipeptide buried
in a hydrophobic pocket at the top of the
Nup192 molecule. Key contacts involve Leu*** 
and Trp*”? of Nup192 and Phe*® of Nup53, 
consistent with our previous identification
of these residues as required for the Nup53^R1-Nup192
interaction by systematic mutagenesis
(Fig. 3D) (41). The Nup145N^R1 binding site is
proximal to the Nic96^R2 binding site, at the
midpoint of the question mark-shaped Nup192,
where a hydrophobic pocket anchors the
Nup145N^R1 MYKL motif (residues 633 to 636;
M, Met; Y, Tyr; K, Lys) that runs perpendicular
to the long axis of the question mark, with its
N terminus oriented toward the N terminus
of Nic96^R2 (Fig. 3E). Overall, comparison of the
Nup192 structures in complex with different
linkers demonstrated that linker binding does
not induce conformational rearrangements in
the scaffold Nup192 (fig. S17).

Validation of the Nup192-Nup145N^R1 interface
through structure-guided mutagenesis confirmed
the importance of the central hydrophobic
Nup145N MYKL anchor motif (Fig. 3F and fig.
S18), but complete ablation of binding was only
observed when the three flanking basic residues
on either side were also mutated to alanine in
the 10-residue KKR-MYKL-RKR mutant (R, Arg)
(Fig. 3, F to H, and figs. S15 and S18 to S20).
Conversely, mutagenesis of the Nup145N^R1
MYKL binding site in Nup192 identified a
quadruple Nup192 LIFH mutant that specifically
abolished Nup192 binding to Nup145N^R1
but not Nic96^R2 or Nup53 (I, Ile; H, His; Fig. 3,
F and H, and figs. S19 to S21). Although basic
residues flanking both Nup145N’s MYKL and
Nup53’s FG (47) anchor motifs contribute to
Nup192 binding, flanking residues were not
resolved in the cryo-EM density.


Nup188 interaction with Nup145N 


To identify the Nup145N regions necessary
and sufficient for Nup188 binding, we used
a five-alanine scanning and fragment truncation
approach analogous to the mapping
of the Nup145N-Nup192 interaction. This
identified a minimal Nup145N^R2 peptide (residues
640 to 732) that recapitulated wild-type
binding and a region between residues 706
and 715 that affected Nup188^NTD binding upon
five-alanine substitution (Fig. 4, A and B, and
figs. S22 and S23). Consistent with our previous
findings (28), we also confirmed that Nup188
does not bind Nup53, even in the presence of
Nic96^R2 and Nup145N (fig. S24).

We determined the structure of an ~220-kDa
Nup188•Nic96^R2•Nup145N^R2 complex by single-particle
cryo-EM. An initial set of 709,123
preferentially oriented particles produced a
reconstruction of Nup188•Nic96^R2 at 2.4-Å global
resolution and an anisotropic 2.3- to 2.5-Å
directional FSC resolution range. Local three-dimensional
classification of particles based
on emergent excess density at the top of the
question mark-shaped Nup188 molecule identified
a subset of 298,317 particles that yielded
a reconstruction of Nup188•Nic96^R2•Nup145N^R2
at 2.8-Å global resolution and an anisotropic
2.7- to 2.9-Å directional FSC resolution range
(Fig. 4C and fig. $25). Nup145N® buries res- 
idues Ie”, Leu”°, and Phe” in a hydrophobic 
cradle adjacent to the SH3-like domain. As
with Nup145N^R1 bound to Nup192, only a central
portion of the Nup145N^R2 peptide was
resolved (residues 706 to 718) (Fig. 4C). No
meaningful conformational changes were observed
between the different Nup188 structures
in response to linker binding (fig. S26).

Fig. 3. Structural and biochemical analyses of the Nup192-Nup145N interaction.
(A) Domain structures of C. thermophilum Nup192, Nic96, Nup53,
and Nup145N and the effect of each five-alanine substitution on Nup145N binding
to Nup192, as assessed by SEC and indicated by colored boxes above the
Nup145N primary sequence. (B) Summary of SEC binding analysis identifying
the minimal Nup145N^R1 (red) region sufficient for Nup192 binding. +++, no effect;
++, weak effect; +, moderate effect; -, abolished binding. (C to E) Cartoon
representation of (C) the 3.2-Å C. thermophilum Nup192•Nic96^R2•Nup53^R1•Nup145N^R1
single-particle cryo-EM structure. Insets indicate regions magnified to illustrate
molecular details of (D) the Nup192-Nup53^R1 and (E) the Nup192-Nup145N^R1
interactions. Red circles indicate residues involved in the Nup192-Nup53^R1
(41) and Nup192-Nup145N^R1 interactions. (F) Effect of Nup145N alanine
substitutions (cyan squares) on Nup192 binding, assayed by SEC (left). Effect
of structure-guided Nup192 alanine substitutions on SUMO-Nup145N^R1 binding,
assayed by SEC (right). (G) KDs determined by triplicate ITC experiments,
with the mean and associated standard error reported. (H) SEC-MALS analysis
of Nup192•SUMO-Nic96^R2•Nup53•Nup145N and Nup192•SUMO-Nic96^R2•Nup53•SUMO-Nup145N^R1
complex formation and disruption by mutants.
Measured molecular masses are indicated, with respective theoretical masses
provided in parentheses.

Fig. 4. Structural and biochemical analyses of the Nup188-Nup145N interaction.
(A) Domain structures of C. thermophilum Nup188, Nic96, and Nup145N and
the effect of each five-alanine substitution on Nup145N binding to Nup188^NTD, as
assessed by SEC and indicated by colored boxes above the Nup145N primary
sequence. (B) Summary of SEC binding analysis identifying the minimal Nup145N^R2
(red) region sufficient for binding to Nup188^NTD. +++, no effect; ++, weak effect;
+, moderate effect; -, abolished binding. (C) Cartoon representation of the 2.8-Å
C. thermophilum Nup188•Nic96^R2•Nup145N^R2 single-particle cryo-EM structure.
Inset indicates region magnified to illustrate molecular details of the
Nup188-Nup145N^R2 interaction. Red circles indicate residues involved in the
Nup188-Nup145N interaction. (D) Effect of Nup145N alanine substitutions
on Nup188^NTD binding, assayed by SEC (left). Effect of structure-guided Nup188^NTD
alanine substitutions on SUMO-Nup145N^R2 binding, assayed by SEC (right).
(E) KDs determined by triplicate ITC experiments, with the mean and associated
standard error reported. (F) SEC-MALS analysis of Nup188•SUMO-Nic96^R2•Nup145N
and Nup188•SUMO-Nic96^R2•SUMO-Nup145N^R2 complex formation and disruption
by mutants. Measured molecular masses are indicated, with respective theoretical
masses provided in parentheses.

Substitution of two resolved Nup145N residues,
Leu710→Ala (L710A) and Phe715→Ala
(F715A), moderately disrupted Nup188^NTD
binding (Fig. 4D and fig. S27). Further systematic
mutagenesis led to a Nup145N EDSILF
mutant, which respectively abolished and reduced
Nup188 binding to Nup145N^R2 and
Nup145N (E, Glu; D, Asp; S, Ser; Fig. 4, E and F,
and figs. S28 to S30). Structure-guided mutagenesis
of Nup188 residues interfacing with
Nup145N^R2 identified a Nup188 HHMI mutant
that abolished binding to Nup145N^R2 but not to
Nup145N (Fig. 4, D to F, and figs. S28 to S30).
Overall, the greater tolerance of the Nup188- 
Nup145N interaction to binding site mutations 
demonstrates an even greater reliance on pro- 
miscuous binding events in flanking regions 
dispersed well beyond the structurally resolved 
core anchor motif. 


Comparison of the Nup192- and 
Nup188-linker complexes 


The determination of full-length structures of
both Nup192 and Nup188 scaffolds bound to
their respective linkers permits a direct comparison
of these two distantly homologous
α-helical solenoids (~28% sequence similarity)
(Movie 1). Although both structures share
the same overall question mark-shaped architecture,
the Nup188 α-helical solenoid displays
a tighter superhelical twist, resulting in an
~10-Å narrower molecule with a compacted
N-terminal ring (fig. S31, A and B). A Tower
protrudes from the midpoint of the α-helical
solenoid toward the Head subdomain in both
structures, extending further in Nup192 than
the comparatively compressed Nup188 version.
Nic96^R2 binds both scaffolds at the base
of the question mark but notably adopts different
secondary structures, switching between
which requires breaking and reforming of α
helices (fig. S31C). By contrast, Nup145N binds
to different parts of Nup192 and Nup188, at the
midpoint and the top of the question mark-shaped
molecules, respectively. Interestingly,
the Nup145N^R2 binding site at the top of Nup188
is nearly congruent with that of the Nup53^R1 on
Nup192 (fig. S31A).

Our structures and biochemical analysis
identify two distinct types of linker-scaffold
interactions. Nic96^R2 binds with high affinity,
using the same well-defined ~60-residue
motif in binding to both Nup192 and Nup188.
By contrast, Nup145N binds to Nup192
and Nup188 through protracted, overlapping
binding regions and a distinctive common
binding mode: Both interactions depend on a
structurally defined ~10-residue Nup145N anchor
motif, yet tight binding requires extensive
~20- to 60-residue N- and C-terminal flanking
regions with high basic character. The Nup53-Nup192
interaction relies on a similar binding
mode. The elusiveness of these interaction-enhancing
linker flanking regions to structural
characterization suggests that their
binding to scaffold surfaces is highly dynamic
and promiscuous. Notably, the previously characterized
~14-residue Nup170-binding motif of
Nup145N (residues 729 to 750) does not depend
on binding-enhancing flanking regions,
suggesting that the uncovered mode of
Nup145N and Nup53 binding to the Nup192
and Nup188 scaffolds is a desirable evolutionary
outcome and architectural principle
of the NPC inner ring.

Movie 1. Structural analysis of the Nup192 and Nup188 inner ring complexes. Comparison of crystal
and single-particle cryo-EM structures of C. thermophilum Nup192 and Nup188 scaffolds in complex with
Nic96, Nup145N, and Nup53 linkers. Cryo-EM densities are rendered as isosurfaces colored according to their
assigned protein chain.

Together, these data complete the struc- 
tural and biochemical characterization of the 
biochemically tractable linker-scaffold inter- 
actions. Nup145N binds at distinct sites on 
Nup192 and Nup188, forming mutually ex- 
clusive interactions with either Nup192 and 
Nup170 or Nup188 through extensive overlap-
ping binding sites mapped to Nup192 (R1,
residues 616 to 683), Nup188 (R2, residues 
640 to 732) and Nup170 (R3, residues 729 to 
750). Binding by means of a central anchor 
motif enhanced by extensive flanking regions 
is reminiscent of Velcro, in which weak bind- 
ing events accumulate to build a robust yet 
flexible interaction with manifold productive 
binding configurations possible in terms of 
both spatial distribution and occupancy. As an 
architectural principle, Velcro-like binding could 
accommodate scaffold movements without 
entirely breaking the linker-scaffold, main-
taining the NPC’s integrity in the face of large-
scale dilation or constriction.


Architecture of the S. cerevisiae linker-scaffold 


The inner ring of the NPC contains eightfold 
rotational symmetry about a nucleocytoplasmic 
axis and twofold symmetry in the plane of the 
nuclear envelope (34, 35). In the S. cerevisiae 
(sc) NPC, each of the 16 inner ring protomers 
was proposed to consist of a scNup192 and a
scNup188 inner ring complex (Fig. 1, C and 
D), with scNup192 and scNup188 located at the 
equatorial and peripheral positions, respec-
tively (36, 37). High-confidence quantitative
docking of our full-length experimental Nup192•
Nic96^R2•Nup145N^R1•Nup53^R1 and Nup188•
Nic96^R2•Nup145N^R2 structures in an ~25-Å
in situ cryo-ET map of the S. cerevisiae NPC (fig.
S32) (36) confirmed these proposals. Whereas
docking of the folded scaffolds Nup170, Nic96, 
Nup192, Nup188, and the CNT into cryo-ET 
maps of intact NPCs revealed their positioning 
to form four concentric cylinders, the linker 
network that connects them has remained 
elusive. Combined with our previously deter-
mined structures of Nup170•Nup53^R3, Nup170•
Nup145N^R3, Nic96^SOL•Nup53^R2, and CNT•Nic96^R1
(28, 35), the Nup192•Nic96^R2•Nup145N^R1•Nup53^R1
and Nup188•Nic96^R2•Nup145N^R2 structures al-
lowed us to identify the locations of all scaffold-
bound linker regions. We considered whether
the length of linker polypeptides connecting 
pairs of scaffold-bound linker segments con- 
strained the topology of linker connections. We 
found a single topology connecting linker seg- 
ments related by the shortest Euclidean dis- 
tance. For a detailed description of these 
results, see the supplementary text (figs. S32
to S37). 
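
The distance-based reasoning above can be illustrated with a minimal sketch. It assumes that an extended polypeptide spans at most ~3.8 Å per residue (the Cα-Cα spacing), and it picks, among symmetry-related copies of a binding site, the closest copy that a linker of a given length could reach. The per-residue value, the function names, and all coordinates are hypothetical and chosen for illustration only; they are not values from the study.

```python
import math

# Assumed maximal span per residue of a fully extended chain, in angstroms.
CA_CA_SPACING_A = 3.8


def max_span(n_residues, per_residue=CA_CA_SPACING_A):
    """Upper bound on the distance an unstructured linker of n residues can bridge."""
    return n_residues * per_residue


def pick_partner(site, candidates, n_linker_residues):
    """Return (name, distance) of the closest reachable candidate, or None."""
    limit = max_span(n_linker_residues)
    reachable = []
    for name, xyz in candidates.items():
        d = math.dist(site, xyz)  # Euclidean distance between the two sites
        if d <= limit:
            reachable.append((name, d))
    return min(reachable, key=lambda item: item[1]) if reachable else None


# Hypothetical coordinates (angstroms): one bound linker segment and three
# symmetry-related copies of the next binding site along the linker.
segment_end = (0.0, 0.0, 0.0)
copies = {
    "same_spoke": (60.0, 20.0, 10.0),
    "adjacent_spoke": (150.0, 40.0, 0.0),
    "across_midplane": (30.0, 0.0, 120.0),
}

choice = pick_partner(segment_end, copies, n_linker_residues=30)
print(choice)
```

With these made-up numbers only one copy lies within reach of a 30-residue linker, which is the sense in which linker length can restrict the topology to a single solution.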

Together, these data elucidated the archi- 
tecture of the S. cerevisiae inner ring linker- 
scaffold (Fig. 5A and Movie 2). Apart from 
spoke bridging mediated by Nup53 orthologs, 
all linker-scaffold connections in the S. cerevisiae 
inner ring occur within the same spoke, there- 
by allowing interspoke gaps to form. The 
linker-scaffold architecture provides a molec- 
ular explanation for the inner ring’s ability to 
exist in constricted and dilated states (36, 37). 


The S. cerevisiae linker-scaffold is robust 
and essential 


Owing to ancestral gene duplication events 
in S. cerevisiae, there are several linker and 
scaffold nup paralogs, including scNup145N 
paralogs scNup116 and scNup100, scNup53 
paralog scNup59, and scNup170 paralog scNup157 
(Fig. 1B) (2). The scNup100 and scNup116 para-
logs contain sequences homologous to the
Nup192, Nup188, and Nup170 binding regions
characterized in C. thermophilum Nup145N,
but only scNup116 possesses the Gle2 binding
site (GLEBS) motif (fig. S38) (50).

To interrogate the function of individ- 
ual scaffold-binding regions in the linker 
scNup116, we established a S. cerevisiae mini-
mal nup100Δnup116Δnup145Δ strain com-
plemented with scNup116 and scNup145C,
ectopically expressed from centromeric plas- 
mids (Fig. 5, B and C, and fig. S39). Next, we 
systematically mutated all functional ele- 
ments in the scNup116 sequence, including 
the scNup192, scNup188, and scNup157/170 
scaffold-binding regions R1, R2, and R3, re- 
spectively, with three types of mutations: 
deletions (ΔR1, ΔR2, and ΔR3), substitutions
with glycine-serine (GS) linkers of equivalent
length (R1/40xGS, R2/40xGS, and R3/12xGS),
or substitutions of sequence-conserved residues
shown to disrupt binding of the C. thermophilum
Nup145N to the respective scaffolds (R1m, R2m,
and R3m) (Fig. 5B and fig. S38). Deletions
and GS-linker substitutions, being aggressive 
types of mutations, were lethal if targeting R1 
and affected growth, mRNA export, and 60S 
preribosome export if introduced in R2 and 
R3. The less aggressive combination of sub-
stitutions, R1m, caused substantial yet non-
lethal phenotypic effects, which were further
exacerbated through combination with R2m
(R1m+R2m) or R3m (R1m+R3m), culminating
with the lethal R1m+R2m+R3m triple muta-
tion (Fig. 5, D and E, and fig. S40). Interestingly,
all scNup116 mutations resulted in temperature- 
dependent loss of enhanced green fluorescent 
protein (eGFP)-scNup116 from the nuclear 
envelope rim and concomitant emergence of 
eGFP-scNup116 foci (Fig. 5D and fig. S40F), as 
previously reported (51-53).

Next, we transposed the insight from our 
structural and biochemical characterization 
of the C. thermophilum linker-scaffold into
equivalent substitutions of conserved resi- 
dues or more aggressive truncations of bind- 
ing site-harboring subdomains of scNup192, 
scNup188, and scNic96 (Fig. 5F). Notably, the 
combination of LAF and LIFH substitutions 
(LAF+LIFH) that ablated Nup192 binding to 
Nic96^R2 and Nup145N^R1, respectively, failed
to rescue the lethal nup192Δ phenotype (Fig.
5G and figs. S41 and S42). Analogously, the
combination of FLV and HHMI substitutions
(FLV+HHMI) that ablated Nup188 binding to
Nic96^R2 and Nup145N^R2, respectively, led to an
additive cold-sensitive slow-growth phenotype
with mRNA and 60S preribosome export de-
fects in the synthetic lethal nup188Δpom34Δ
strain (Fig. 5, G and H, and figs. S43 and S44) 
(54-56). Finally, we introduced the transposed 
FFF substitutions of evolutionarily conserved 
hydrophobic residues that abolished Nic96^R2
binding to Nup192 and Nup188 into scNic96,
along with scNic96^R2 deletion (ΔR2) (Fig. 5F
and fig. S45). Surprisingly, neither mutation
resulted in a detectable phenotype in a nic96Δ
strain (Fig. 5, F to H, and fig. S46). The com-
posite structure of the NPC linker-scaffold sug-
gests that Nic96^R2 binding to Nup192 and
Nup188 restricts the diffusive path of the N-
terminal Nic96 linker, thereby correctly posi-
tioning the Nic96^R1 assembly sensor that
recruits the CNT complex (Fig. 5J). We rea- 
soned that CNT mispositioning would affect 
the spatial distribution and local concen- 
tration of FG repeats, with consequences on 
nucleocytoplasmic transport. Therefore, we 
replaced the Nic96^R2 region with GS-linkers
matching the number of residues (R2/66xGS) or
approximating its α-helical length (R2/32xGS)
(Fig. 5F). Despite not affecting CNT recruit-
ment by the Nic96^R1 region, the R2/66xGS
and R2/32xGS mutations resulted, respec-
tively, in lethal and severely deleterious effects
on growth, mRNA export, and 60S preribo- 
some export (Fig. 5, G to I, and fig. S46). For a 
detailed description of these results, see the 
supplementary text. 

Taken together, these data demonstrate the 
physiological relevance of our residue-level 
biochemical and structural characterization 
of Nup192 and Nup188 as keystone scaffold 
hubs of the inner ring that integrate connec- 
tions between the membrane-coating Nup170 
layer and the central transport channel- 
interfacing CNT layer through respective in- 
teractions with Nup145N and the N-terminal 
Nic96 linker regions. The wild-type phenotype 
of the nup53Δnup59Δ strain (57) precludes
analysis of the scNup53 and scNup59 inter- 
actions in S. cerevisiae. However, this fact, 
coupled with our knockout of all but one of 
the scNup145N paralogs, highlights the ro- 
bustness of the S. cerevisiae inner ring archi- 
tecture, which can tolerate a considerable loss 
of linker-scaffold interactions. Robustness is 
also found in nup-nup interactions, whereby
perturbing a linker-scaffold interaction re- 
quires multiple residue substitutions in both 
structured motifs and flanking linker regions. 


Evolutionary conservation of the 
human linker-scaffold 


Despite low sequence conservation, composite 
structures of the human and S. cerevisiae 
NPCs reveal an identical positioning of the 
scaffold nups, suggesting that the linker- 
scaffold architecture is evolutionarily conserved 
(34-37). Specifically, the human linker-scaffold 
interactions, the topology of scaffold-binding 
regions in the linkers, and the location of linker- 
binding sites in the scaffolds are expected to 
match those of the C. thermophilum nups 
(28, 33, 35, 40, 41). We developed expression 
and purification protocols for recombinant 
human nups (Fig. 6A), enabling systematic 
interaction analyses between scaffold and linker 
nups, for which we generated truncation and 
sequence variants, aided by multispecies se-
quence alignments (figs. S38, S45, and S47 to
S49). For a detailed description of these re-
sults, see the supplementary text (figs. S50 to
S53). Together with our previous mapping of
the NUP155^CTD•NUP98^R3 interaction (35), these
data establish that the linker-scaffold is evolu- 
tionarily conserved from C. thermophilum to 
humans, including the linker-binding sites in 
the scaffolds and the topology of the scaffold- 
binding regions in the linkers (Fig. 6B). 


Biochemical and structural analysis 
of the human NUP93-NUP53 interaction 


Our dissection of the human linker-scaffold 
interaction network identified an interac-
tion between NUP93^SOL and a NUP53 region
N-terminal of the RNA recognition motif (RRM)-
like domain (N) (residues 1 to 169) (fig. S51)
that was nevertheless devoid of homology to
the corresponding C. thermophilum Nup53^R2
amphipathic α-helix motif that fits into a hy-
drophobic groove of the Nic96^SOL scaffold
(figs. S48 and S49) (35). Through fragment
truncation and five-alanine scanning muta-
genesis, we identified a NUP53^R2 region (re-
sidues 84 to 150) that formed a stable complex
with NUP93^SOL, within which residues 86 to
100 were required for binding to NUP93^SOL
(Fig. 6, C and D, and figs. S54 and S55). 

To elucidate the molecular details of binding 
between the divergent NUP53^R2 and NUP93^SOL,
we determined crystal structures of apo
NUP93^SOL and NUP93^SOL•NUP53^R2 at 2.0-
and 3.4-Å resolution, respectively. As with
other linker-scaffold interactions, only a core
region (residues 88 to 95) of the biochemically
mapped minimal NUP53^R2 was resolved (Fig.
6E and fig. S56). C. thermophilum and human
Nic96^SOL orthologs display equivalent α-helical
solenoid architectures (Fig. 6F) (35, 58, 59). In
contrast to the C. thermophilum Nup53^R2

Fig. 5. Functional in vivo dissection of the S. cerevisiae NPC linker-scaffold.
(A) Cross-sectional view of the S. cerevisiae NPC composite structure generated by
docking linker-scaffold structures into an ~25-Å in situ subtomogram averaged cryo-
ET map [Electron Microscopy Data Bank (EMDB) ID EMD-10198] (36) (top).
Schematic representation of a S. cerevisiae NPC spoke (bottom). (B) Domain
structure of scNup116 variants, the Gle2-binding sequence (GLEBS), FG repeats, the
scNup192-binding region (R1), the scNup188-binding region (R2), and the scNup157/
170-binding region (R3). (C) Viability analysis of a 10-fold dilution series of a
nup100Δnup116Δnup145Δ/NUP145C strain expressing scNup116 variants and
subjected to 5-fluoroorotic acid (5-FOA) selection for loss of rescuing wild-type plasmid.
(D) Subcellular localization at permissive (30°C) and growth-challenging (37°C)
temperatures of a representative subset of eGFP-scNup116 variants in a
nup100Δnup116Δnup145Δ/NUP145C-mCherry strain. (E) Representative images and
quantitation (n > 500) of subcellular localization of 60S preribosomal export reporter
scRpl25-mCherry and poly(A)+ RNA at 30° and 37°C in the presence of a representative
subset of scNup116 variants. (F) Domain structures of scNup192, scNup188, and
scNic96 variants. (G) Viability analysis of a 10-fold dilution series of nup192Δ,
nup188Δpom34Δ, and nic96Δ S. cerevisiae strains expressing scNup192, scNup188, and
scNic96 variants, respectively, and subjected to 5-FOA selection for loss of rescuing
wild-type plasmids. (H) Representative images and quantitation (n > 500) of the
nuclear scRpl25-mCherry and poly(A)+ RNA retention in the presence of
scNup188 and scNic96 variants at indicated growth-challenging temperatures in
nup188Δpom34Δ and nic96Δ S. cerevisiae strains, respectively. (I) Subcellular
localization at permissive (30°C) and growth-challenging (37°C) temperatures of
the scCNT subunit scNup57-eGFP in the presence of mCherry-scNic96 variants.
(J) Schematic model of CNT positioning in wild-type, scNic96^R2 deletion, and
GS-linker replacement strains. Squares associated with variant labels are color
coded according to the nup binding partners targeted by the mutation. All
experiments were performed in triplicate. Mean and associated standard error
are reported for all quantitation. Scale bars are 5 μm.


Movie 2. Architecture of the S. cerevisiae NPC linker-scaffold. An animated dissection of the composite
structure generated by docking high-resolution crystal and single-particle cryo-EM structures into the
S. cerevisiae ~25-Å NPC cryo-ET map (EMDB ID EMD-10198) (36). The nuclear envelope and protein cryo-ET
densities are rendered as opaque and transparent gray isosurfaces, respectively. Crystal structures of
nups and nup complexes are shown in cartoon representation. Unstructured linker connections between
docked scaffolds are drawn as dashed lines.


amphipathic α helix, human NUP53^R2 binds
the conserved NUP93^SOL hydrophobic groove
that encompasses α helices α5 and α13 to α15
as a linear eight-residue motif, burying ~1100 Å²
of combined surface area (Fig. 6, E and I to K,
Movie 3, and fig. S56E). Systematic alanine
substitution of the resolved NUP53^R2 motif,
invariant across metazoan NUP53 sequences 
(fig. S49), confirmed the key role of the Pro®?- 
Pro” di-proline and Ile™, consequently also 
illustrating the importance of the NUP93^SOL
residues that interface with them (Fig. 6, G, H, 
and L, and fig. S57). Together, our data es- 
tablish that despite distinct binding modes 
and low sequence conservation, linker-scaffold 
interactions are evolutionarily conserved, fur- 
ther highlighting their essential role and in- 
dicating that shape conservation of scaffolds is 
a key determinant of the NPC architecture. 


Architecture of the human NPC symmetric core


Quantitative docking of nup complexes
in cryo-ET maps of the human NPC


We have previously demonstrated that nup 
ortholog crystal structures can be success-
fully used to interpret the density of an ~23-Å
cryo-ET map of the intact human NPC, yield-
ing a near-atomic composite structure of the
NPC symmetric core that included linker-
scaffold crystal structures of the Nup170•
Nup53^R3•Nup145N^R3, Nic96^SOL•Nup53^R2, and
CNT•Nic96^R1 complexes (28, 35). The newly
available structures of full-length Nup192 and
Nup188 as part of Nup192•Nic96^R2•Nup145N^R1•
Nup53^R1 and Nup188•Nic96^R2•Nup145N^R2
linker-scaffold complexes, as well as the hu-
man NUP93^SOL•NUP53^R2 complex, allowed us
to build on our previous analysis with an im-
proved ~12-Å cryo-ET map of the intact hu-
man NPC (provided by Martin Beck’s group)
(47). As for the S. cerevisiae NPC described 
above, our quantitative docking approach con- 
sisted of statistically scoring the fit of resolution- 
matched densities simulated from crystal and 
single-particle cryo-EM structures that were 
randomly placed and locally refined in cryo- 
ET maps of the human NPC. Structures of 
the CNC, Nup192•Nic96^R2•Nup145N^R1•Nup53^R1,
Nup188•Nic96^R2•Nup145N^R2, and NUP358^NTD
(reported in the accompanying manuscript)
(60) were readily placed in cryo-ET maps of
the entire NPC or of the inner ring portion.
Assigned density was then iteratively sub-
tracted from the maps to reduce the subse-
quent search space for NUP93^SOL•NUP53^R2,
Nup170•Nup53^R3•Nup145N^R3, CNT•Nic96^R1,
and NUP53^RRM (fig. S58). For a detailed de-
scription of these results, see the supplemen-
tary text (figs. S58 to S77).
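
The idea of scoring a placement against a background of other placements can be shown with a toy, one-dimensional sketch. It is not the docking procedure used for the maps described here: the "map" is a synthetic noisy profile, placements are scanned exhaustively rather than sampled at random and refined, and the score is a plain Pearson correlation converted to a z-score. All names, sizes, and noise levels are assumptions made for illustration.

```python
import random
import statistics


def correlation(a, b):
    """Pearson correlation between two equally sized density profiles."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulated(offset, width=40, n=200):
    """Density simulated from a 'structure' placed at the given offset."""
    return [1.0 if offset <= i < offset + width else 0.0 for i in range(n)]


rng = random.Random(0)
# Synthetic map: the true feature sits at voxels 80-119, plus Gaussian noise.
noisy_map = [v + rng.gauss(0.0, 0.3) for v in simulated(80)]

# Score every candidate placement against the map.
scores = {off: correlation(simulated(off), noisy_map) for off in range(0, 161)}
best_offset = max(scores, key=scores.get)

# Express the best score relative to the distribution of all scores.
background = list(scores.values())
z = (scores[best_offset] - statistics.fmean(background)) / statistics.stdev(background)
print(best_offset, round(scores[best_offset], 2), round(z, 2))
```

In a real three-dimensional search the background would come from many random starting placements, and subtracting already assigned density, as described above, shrinks the space in which later, smaller components must be found.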

The structures and improved cryo-ET map 
of the intact human NPC have disambiguated 
the placement of NUP188 and NUP205 hubs in 
the inner ring and distal outer ring positions, led 
to the discovery of a proximal NUP205 in the 
cytoplasmic outer ring, identified NUP93^SOL
in the outer rings, placed NUP53^RRM homo-
dimers between inner ring spokes, and revealed
a comprehensive map of scaffold-bound linker 
segments that implied a single symmetric core 
linker topology connecting linker segments re- 
lated by the shortest Euclidean distance (Movie 
4 and figs. S77 and S78). The composite structure 
includes ~400,000 ordered residues that ex- 
plain nearly all protein density of the symmet- 
ric core and assign the protein identity and 
location of ~64 MDa out of ~110 MDa of the 
human NPC mass. 


Architecture of cytoplasmic and nuclear 
outer rings 


In both the cytoplasmic and nuclear outer 
rings of the human NPC, 16 copies of the Y- 
shaped CNC are arranged in two concentric 
proximal and distal rings. At equivalent lo- 
cations on both the cytoplasmic and nuclear 
sides, eight copies of NUP205 are interca- 
lated between the proximal NUP75 arms and 
distal NUP107 stalks of CNCs from adjacent 
spokes. Eight NUP93^SOL copies are inserted
between the distal NUP107 and proximal
NUP96 α-helical solenoids, bisecting the stalks
of tandem-arranged CNCs of a single spoke
(Fig. 7 and fig. S79). Stretched out, the ~25-
residue unstructured linker connecting the
R2 and SOL regions of NUP93 bridges the
~95-Å gap between the distal NUP205-bound
NUP93^R2 and the distal NUP93^SOL of an adja-
cent spoke, thus cross-linking the outer ring
spokes (Fig. 7, Movie 4, and figs. S78 to S80).
Compared to the Nup192 ortholog, NUP205
presents an additional ~240 residues that elon-
gate the C-terminal Tail region, suggesting that
~95 Å is an upper estimate for the distance
between distal NUP93^R2 and NUP93^SOL from
adjacent spokes. 
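
As a rough consistency check on the ~95-Å figure, and assuming that a fully extended chain spans at most ~3.8 Å per residue (the Cα-Cα spacing; this per-residue value is an assumption, not stated in the text), the maximal reach of a ~25-residue linker works out as:

```latex
\[
  L_{\max} \approx N_{\mathrm{res}} \times d_{\mathrm{C\alpha}}
           = 25 \times 3.8\ \text{\AA}
           = 95\ \text{\AA}
\]
```

Under this assumption the linker would have to be essentially fully stretched to span the gap, which is consistent with treating ~95 Å as an upper estimate.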

Specific to the cytoplasmic face, an addi- 
tional eight copies of the proximal NUP205 
are lodged between the NUP75 arm of the 
proximal CNC and the bridge NUP155 that 
connects the outer and inner ring (Fig. 7A 
and fig. S79). The proximal NUP205-bound 
NUP93^R2 can only be linked with a proximal
NUP93^SOL of the same spoke (Fig. 7A and figs.
S79 and S80A). Furthermore, the arrangement
of NUP53 binding sites on NUP93^SOL and
NUP205 copies in the cytoplasmic outer rings
is compatible with the NUP53-mediated link-
age of the distal NUP205 with the proximal
NUP93^SOL and, conversely, the proximal
NUP205 with the distal NUP93^SOL (Fig. 7 and

Fig. 6. Evolutionary conservation of the human linker-scaffold network.
(A) Domain structures of the human inner ring nups. NUP93 consists of linker
(residues 1 to 173) and scaffold (residues 174 to 819) regions. (B) Schematic
summary of the linker-scaffold interactions in complexes organized around the
NUP188 and NUP205 scaffold hubs. Black lines connecting colored bars indicate
interactions between nup regions. (C) Summary of SEC binding analysis identifying
the minimal NUP53^R2 region (red) sufficient for NUP93^SOL binding. +++, no effect;
++, weak effect; +, moderate effect; -, abolished binding. (D) Effect of each five-
alanine substitution on SUMO-NUP53^R2 binding to NUP93^SOL, as assessed by SEC and
indicated by colored boxes above NUP53 primary sequence. (E) Cartoon
representations of the 2.0-Å H. sapiens apo NUP93^SOL, 3.4-Å H. sapiens
NUP93^SOL•NUP53^R2, and 2.7-Å C. thermophilum Nic96^SOL•Nup53^R2 (PDB ID 5HB3)
(35) crystal structures and their superposition. An ~12° displacement of the
C-terminal α-helical solenoid, pivoted about the hinge loop, is observed between the
apo NUP93^SOL and NUP93^SOL•NUP53^R2 structures. (F) Schematic of the human
NUP93^SOL and C. thermophilum Nic96^SOL fold architectures. (G and H) Summary of
the effect of structure-guided mutations in (G) SUMO-NUP53^R2 and (H) NUP93^SOL
on NUP93^SOL•SUMO-NUP53^R2 complex formation, as assayed by SEC. (I to K) Magnified
views of the regions indicated with insets in (E), which compare molecular details of the
NUP53^R2 and Nup53^R2 binding sites. (L) SEC analyses of NUP93^SOL•SUMO-NUP53^R2
complex formation and disruption by mutants. SDS-polyacrylamide gel electrophoresis
(SDS-PAGE) gel strips of peak fractions visualized by Coomassie staining.

Movie 3. Structural analysis of the NUP93^SOL•NUP53^R2 structure. Comparison of cartoon representa-
tions of H. sapiens NUP93^SOL•NUP53^R2 with S. cerevisiae Nic96^SOL (PDB ID 2QX5) (59) and C. thermophilum
Nic96^SOL•Nup53^R2 (PDB ID 5HB3) (35) orthologs and comparison of conformational differences between
apo NUP93^SOL and NUP93^SOL•NUP53^R2 obtained from different crystal forms.


Movie 4. Architecture of the symmetric core of the human NPC. An animated dissection of the composite 
structure generated by quantitatively docking high-resolution crystal and single-particle cryo-EM structures into 
the human ~12-Å NPC cryo-ET map (EMDB ID EMD-14322) (47). The nuclear envelope and protein cryo-ET
densities are rendered as opaque and transparent gray isosurfaces, respectively. Crystal structures of nups and 
nup complexes are shown in cartoon representation. Unstructured linker connections between docked scaffolds 
are drawn as dashed lines. 

figs. S79 and S80, D and E). NUP53 could also
mediate long-range links between the bridge
NUP155 and the R1 and R2 binding sites on
outer ring NUP205 and NUP93^SOL, respectively
(fig. S80, F, H, and I). 

On both the nuclear and cytoplasmic sides, 
the NUP98^APD-binding NUP96 sites present in
the 16 CNC copies recruit 16 copies of NUP98 
that can simultaneously satisfy the outer ring 
NUP205 binding sites because of the ~115- 
residue linker between the autoproteolytic 
domain (APD) and the R1-R2-R3 regions 
(Fig. 7 and fig. S79). The cytoplasmic proximal 
NUP205 and bridge NUP155 scaffolds could, 
in principle, be linked by NUP98, although 
NUP98^R3 is likely outcompeted from its bridge
NUP155 binding site by the asymmetric cyto-
plasmic filament nups GLE1^CTD•NUP42 (61), as
explained in the accompanying manuscript 
(Fig. 7A and fig. S80C) (60). To maximize 
nup copy parsimony while satisfying all avail- 
able scaffold binding sites, the outer rings 
would recruit 16 copies of NUP53 and NUP98 
on each side of the NPC. 


Architecture of the inner 
ring linker-scaffold 


The quantitative docking confirmed evolu- 
tionary conservation of the inner ring linker- 
scaffold architecture between Homo sapiens 
and S. cerevisiae NPCs (Fig. 8). A NUP155•NUP53
linker-scaffold coats the nuclear envelope, 
anchored by membrane curvature-sensing 
amphipathic lipid packing sensor (ALPS) mo- 
tifs and the C-terminal NUP53 amphipathic 
helix (35, 62-64), with a peripheral and equa- 
torial copy on each side of a spoke midplane. 
The homodimerizing NUP53^RRM domains link
spoke halves across the midplane (Fig. 8C and
figs. S81 and S82, A and B). A second cross-
midplane link between NUP53 and NUP93
connects NUP188•NUP93•CNT and NUP205•
NUP93•CNT modules to the NUP155 coat at
equatorial and peripheral positions, respec-
tively (Fig. 8, D to F, and fig. S82, C to J).
Restrained by their length, NUP93 N-terminal
linkers connect NUP188 with the peripheral
and NUP205 with the equatorial inner ring
NUP93^SOL and CNT copies of the same inner
ring spoke (Fig. 8, D to F, and fig. S82, G to J). 

The equatorial position of NUP205 is fur- 
ther solidified by a NUP98-mediated linkage 
with the equatorial NUP155 and by a NUP53- 
mediated linkage with a peripheral NUP93 
from an adjacent spoke (Fig. 8E and fig. S82, 
E and F). As in the S. cerevisiae NPC, the 
NUP205, NUP188, and peripheral NUP155 
binding sites for the NUP98 R1, R2, and R3 
regions, respectively, are too far apart to be 
linked by the same NUP98, suggesting that 
three NUP98 copies are required to satisfy all 
binding sites on each side of a spoke midplane 
instead. The NUP98^APD-binding NUP96 and
cytoplasmic filament NUP88^NTD (placed in


Fig. 7. Architecture of the human NPC symmetric core outer rings. (A and B) 
Composite structure generated by quantitatively docking crystal and single-particle 
cryo-EM structures into an ~12-Å cryo-ET map of the intact human NPC (EMDB ID
EMD-14322) (47) viewed from the (A) cytoplasmic and (B) nuclear face. Nuclear 
envelope and docked structures are rendered in isosurface and cartoon representation, 


the accompanying manuscript) (60) sites are 
within reach of the ~115-residue linker be- 
tween the APD and the R1-R2-R3 regions, thus 
linking the inner ring with the outer ring and 
cytoplasmic asymmetric portions of the NPC. 
The inner ring scaffold architecture is in- 
terwoven by linker interactions (Movie 4). The 
peripheral NUP155-NUP188-NUP93-CNT and 
equatorial NUP155-NUP205-NUP93-CNT scaf- 
fold modules that together form a protomer 
for a D8-symmetric inner ring can themselves 
be superposed, if NUP188 and NUP205 are con- 
sidered as equivalent organizing hubs (Fig. 8G). 
Nevertheless, the peripheral NUP155 is not 
linked to the peripheral NUP188•NUP93•CNT
complex from the same side of a spoke mid-
plane. Instead, within each spoke, linker-mediated
complexes form between a peripheral NUP188•
NUP98•NUP93•CNT•NUP53 and an equatorial
NUP155 from across the midplane (fig. S82K).
By contrast, the equatorial NUP205•NUP98•
NUP93•CNT•NUP53 complex is linked with
a peripheral NUP155 from the same and an
equatorial NUP155 from the opposite side of a 
spoke midplane, through NUP98- and NUP53- 
mediated links, respectively (fig. S82K). 

To maximize nup copy parsimony while 
satisfying all available scaffold binding sites, 
the human NPC would recruit a total of 56 and
80 copies of NUP53 and NUP98, respectively. 
Though the rest of the symmetric core com- 
posite structure agrees with previous esti- 
mates of nup stoichiometry, the implied NUP98 
and NUP53 copy number exceeds the empir- 
ical measurements (65). Whereas the discrep- 
ancy may be explained by available NUP98 
and NUP53 binding sites not being fully 
occupied, the NUP98 and NUP53 linker nups
are known to exchange comparatively rapidly 
at the NPC (66, 67) and might get depleted as 
part of the preparation for mass spectrometric 
analysis. 
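
The NUP98 total quoted above can be reproduced from the per-spoke counts given in the text; the short tally below is a bookkeeping sketch with variable names of our choosing. For NUP53, the text gives only the total and the outer ring contribution, so the inner ring count shown is simply the remainder implied by those two numbers and is not a value reported in the text.

```python
SPOKES = 8   # eightfold rotational symmetry of the NPC
HALVES = 2   # two sides of a spoke midplane (cytoplasmic and nuclear)

# NUP98: three copies per side of a spoke midplane in the inner ring,
# plus 16 copies per side in the outer rings.
nup98_inner = 3 * HALVES * SPOKES
nup98_outer = 16 * HALVES
nup98_total = nup98_inner + nup98_outer

# NUP53: 16 copies per side in the outer rings; the stated total is 56.
nup53_total_stated = 56
nup53_outer = 16 * HALVES
nup53_inner_implied = nup53_total_stated - nup53_outer  # inferred by subtraction

print("NUP98:", nup98_inner, "+", nup98_outer, "=", nup98_total)
print("NUP53 inner ring (implied):", nup53_inner_implied)
```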

Docking the composite structure of the NPC 
into a ~37-Å in situ cryo-ET map of the dilated


respectively. Insets indicate regions encompassing two spokes (top), 90° rotated and 
magnified (middle), and schematized (bottom). Cross-spoke distances between the distal 
NUP205-bound NUP93^R2 and distal NUP93^SOL are indicated in red. Linker binding sites
on scaffold nup surfaces are indicated by colored circles. Dashed transparent shapes 
indicate the absence of proximal NUP205 and NUP93 from the nuclear outer ring. 


human NPC revealed that the inner ring spokes 
move as relatively rigid bodies, accommo- 
dating the dilation by increasing the NUP53 
linker-bridged gaps between spokes (fig. S83, 
A and B) (45). Spatial restraints in the dilated 
NPC confirm the intraspoke topology of link- 
ages established by N-terminal NUP93 and 
NUP98 linkers. Notably, the dilation of the 
NPC does not induce a substantial increase in 
the gap between the adjacent spokes of the 
outer rings, consistent with the cross-linking 
purported by the linker between distal NUP93 
R2 and SOL regions of adjacent spokes (Movie 
5 and fig. S83B). 


Conclusions 


The linker-scaffold is a fundamental architec- 
tural principle of the NPC structure. Despite 
the continuing improvement in resolution 
attained by cryo-ET reconstructions of intact 
NPCs over the past decade, the fine molecular 
detail of the linker-scaffold has remained 

Fig. 8. Linker-scaffold architecture of the human NPC inner ring. (A and B)
Composite structure generated by quantitatively docking crystal and single-
particle cryo-EM structures into an ~12-Å cryo-ET map of the intact human NPC
(EMDB ID EMD-14322) (47) viewed from (A) the cytoplasmic face and (B) the
central transport channel cross-section. Nuclear envelope and docked structures
are rendered in isosurface and cartoon representation, respectively. (C to F)
Starting from the nuclear envelope, successive layers reveal the architecture of
three inner ring spokes of the human NPC. Corresponding schematics illustrate
linker paths between binding sites on scaffold surfaces (colored circles).
NUP53^RRM domains (C) homodimerize between spokes to link cytoplasmic
peripheral with nuclear equatorial, and conversely cytoplasmic equatorial with
nuclear peripheral copies of NUP155. NUP53^RRM domains (D) link cytoplasmic
peripheral with nuclear equatorial and, conversely, cytoplasmic equatorial
with nuclear peripheral copies of NUP93^SOL. NUP205 and NUP188 (E) bind to the
equatorial and peripheral NUP93^R2, respectively. NUP98 connects NUP205 and
equatorial NUP155. NUP53 connects NUP205 and cross-spoke peripheral
NUP93^SOL. (F) CNT is recruited by NUP93^R1 and positioned by NUP93^R2 binding
to NUP188 and NUP205. (G) Close-up views of inner ring modules assembled
around NUP188 and NUP205 scaffold hubs and their superposition. Dashed lines
indicate unstructured linker nup segments and FG-repeat regions.


Movie 5. Dilation and constriction of the human NPC. Interpolated transition between near-atomic 
composite structures of the symmetric core in the constricted state of the ~12-Å NPC cryo-ET map (EMDB ID
EMD-14322) (47) obtained from purified nuclear envelopes and the symmetric core in the dilated state
observed in the ~37-Å in situ cryo-ET map (EMDB ID EMD-11967) (45) of the human NPC. Enabled by
linker-scaffold plasticity, the outward motion of the relatively rigid inner ring spokes enlarges the central 
transport channel and generates lateral channels between spokes. 


out of reach. We used comprehensive residue- 
level biochemical reconstitution and mapping, 
crystallographic and single-particle cryo-EM 
structure determination, in vivo validation, 
and quantitative docking into improved and 
diversified cryo-ET NPC reconstructions to 
delineate the near-atomic structure and evo- 
lutionary conservation of the linker-scaffold 
interactions that underpin the integrity of 
the NPC. 

This study completes the set of structures 
capturing all biochemically tractable linker- 
scaffold interactions of the symmetric core. 
Docked into cryo-ET maps of the human and 
S. cerevisiae NPC, they reveal the topology and 
restrain distances between linker binding sites 
on scaffold surfaces, outlining how the multi- 
valent linkers Nup145N/NUP98, Nup53/NUP53 
and the N-terminal region of Nic96/NUP93 
connect different parts of the NPC. Linkers 
mediate the formation of inner ring com- 
plexes that coalesce into relatively rigid spokes 
spanning from the nuclear envelope to the 
central transport channel. They also cross- 
stitch the inner ring scaffolds with connec- 
tions across spoke midplanes and flexible 
links between spokes. In the outer rings, linker- 
scaffold interactions connect spokes and project 
ties toward the inner ring. 

Our biochemical analysis of linker-scaffold 
interactions involving Nup145N and Nup53 
revealed another architectural principle, invi- 
sible to structural methods: Linker-scaffold 
interactions are driven by structurally defined 
anchor motifs that present canonical two-
component binding dynamics but are poten-
tiated by disperse, structurally elusive interac-
tions between flanking residues and promiscuous
binding sites on scaffold surfaces. These Velcro- 
like binding modes, sometimes referred to as 
“fuzzy interactions,” are found in systems in- 
volving intrinsically disordered proteins across 
biology, a prominent example being the ultra- 
fast exchange of nucleocytoplasmic transport 
receptors on FG repeats (72). 

The physical and chemical properties of 
linkers are advantageous for the assembly of 
giant complexes like the NPC. The unfolded 
property of linkers enables long-range inter- 
actions and confers flexibility that can accom- 
modate large movements or shock-absorb 
nuclear envelope deformations. The disperse 
nature of linker-scaffold interactions is con- 
ducive to the reuse of linker interactions in 
different chemical and steric environments of 
the NPC. The ensemble of binding modes pro- 
vides robustness in the face of conformational 
changes of the NPC that might otherwise be 
incompatible with a singular binding mode. 
The bulk of FG repeats present in the central 
transport channel, which form promiscuous 
transient interactions with the inner ring 
scaffold and other parts of the NPC, is likely 
to have a similar effect. Specifically, the pre-
vious findings that FG repeats not only form the
NPC's diffusion barrier but also interact with
scaffolds support this notion (49, 52, 58, 68, 69).
Considering the avidity that results from 
multiple scaffold valences per linker, and 
the allovalency mediated by flanking resi-
dues (70, 71), it is unsurprising that our in vivo
perturbation of interactions required exten- 
sive mutations to exacerbate deleterious pheno- 
types. For this reason and because of the lack 
of constraints imposed by a protein fold, new 
linker sequences are readily evolvable. Nota-
bly, the linker-scaffold network topology and
modular binding site distribution on linkers are
conserved from fungi to humans, despite con-
siderable divergence in linker sequences, most
strikingly exemplified by the complete diver-
gence of the Nup53/NUP53 motif that binds to
a conserved site on Nic96/NUP93.

The binding of linkers is amenable to ex- 
change and regulation. The linearity of link- 
ers imposes few obstacles to the deposition of 
posttranslational modifications by the same 
machinery along the entire sequence to rapidly 
ablate the multiple binding valences. Indeed, 
patterns of Cdk1 and Nek-driven phosphoryl- 
ation that lead to the choreographed deple- 
tion of both NUP98 and NUP53 from the NPC 
during mitotic nuclear envelope breakdown in- 
clude the R1, R2, and R3 regions of both linkers 
(72, 73). Structural defects in the NPC resulting 
in aberrant nucleocytoplasmic transport may 
affect gene expression, mRNA maturation, 
and mRNA export, leading to downstream 
tumorigenic processes. Therefore, the deple- 
tion of NUP98 from the NPC as a result of 
gene fusion mutations associated with vari- 
ous hematopoietic malignancies should be 
considered in the study of the carcinogenic 
mechanisms triggered by these mutations (74). 

The unexpected discovery of the presence 
and distinctive role of NUP93 in cross-linking 
the outer ring spokes of the human NPC, 
along with its organizing role in the inner 
ring as both scaffold and linker, exemplifies 
the reuse of linker-scaffold functional units at 
completely different locations of the NPC. Its 
ubiquity rationalizes the observation that rapid 
degron-induced depletion of NUP93 leads 
to the concomitant loss of both inner and 
outer rings from the nuclear pore, legitimizing 
NUP93 as a “lynchpin” of the NPC (75). These 
findings further inform the mechanistic basis 
for pathologies like steroid-resistant nephrotic 
syndrome (SRNS), which is associated with 
mutations in NUP93 and NUP205, the most
notable of which is NUP205 Phe1995→Ser
(F1995S), located in the NUP93^R2 binding site
of NUP205 and shown to abolish the NUP205-
NUP93 interaction (76). Nonsense mutations
that omit the NUP93^R2 binding site in NUP188
are also associated with neurologic, ocular, and 
cardiac abnormalities (77). 

The docking of our NPC composite struc-
ture into an ~37-Å in situ cryo-ET map of the
dilated human NPC demonstrates the magni-
tude of the movements that must be withstood
by the linkers that connect the inner ring spokes 
(45). Importantly, the dilated inner ring reveals 
lateral channels between its spokes that can 
accommodate the passage of small cytosolic
domains of inner nuclear membrane inte- 
gral membrane proteins (INM-IMPs), sug- 
gesting the mechanism by which inner ring 
constriction upon energy depletion might 
interfere with the path of diffusion of these 
proteins between outer and inner nuclear 
envelope membranes (78-80). It remains to 
be established whether the extent of inner ring
dilation observed by current cryo-ET recon-
structions of the human and S. cerevisiae
NPCs captures the maximally achievable lateral
channel dimensions. However, the previous
observations that karyopherin-mediated ac-
tive INM-IMP transport requires unstructured
tethers spanning the distance between the
nuclear envelope and the central transport
channel are consistent with the observed lateral
channel dimensions (fig. S83C) (81-85). Tak-
ing advantage of the composite NPC struc-
tures, future studies are expected to elucidate
the nucleocytoplasmic translocation pathways 
and the impact of inner ring dilation for in- 
dividual INM-IMPs. Similarly, the transloca- 
tion of perinuclear domains on the opposite 
side of the nuclear envelope is expected to be 
limited by the luminal ring that encircles the 
NPC midplane formed by Pom152/POM210 
Ig-like domains (37, 47, 86). 

Finally, the capacity of the NPC to exist in 
dilated and constricted states dependent on 
tension imparted by the surrounding nuclear 
envelope portrays the NPC as the cell’s largest 
mechanosensitive channel, with implications 
in cellular energy-state sensing and transport 
of transcriptional regulators in response to 
mechanical stress on the cell (46, 87). The con- 
striction and dilation of the inner ring may not 
only affect the distribution and local concen- 
tration of FG repeats in the central transport 
channel, a determinant of karyopherin-mediated 
transport efficiency (88), but also sterically mod-
ulate the flux of INM-IMPs and large cargos
such as preribosome or messenger ribonucleo-
protein particles (mRNPs). Future studies will
have to establish the causal links between the 
transmission of tension from the nuclear envel- 
ope to the NPC, the dilation of its inner ring, and 
mechanisms by which dilation and constriction 
may modulate nucleocytoplasmic transport. 

Our results illuminate the elusive linker- 
scaffold molecular interactions that main- 
tain the integrity of the NPC, providing a 
comprehensive characterization of a final 
major aspect of the NPC symmetric core. Build- 
ing on this roadmap, future studies can address 
NPC assembly and disassembly mechanisms, 
the emergence of NPC polarity and attachment 
of the asymmetric cytoplasmic filaments and 
nuclear basket to the symmetric core, mecha- 
nisms of NPC-associated diseases, and mecha- 
nisms of NPC dilation and constriction along 
with their implications in nucleocytoplasmic 
transport. 
Methods summary

Comprehensive materials and methods are 
presented in the supplementary materials. 
Briefly, the source of materials and reagents is 
listed in table S1. Summaries of bacterial ex- 
pression constructs and conditions (table S2), 
protein purification procedures (table S3), an-
alytical SEC-MALS protein interaction analyses
(table S4), and ITC binding affinity measure-
ments (table S5) are provided. Experimental
details of x-ray crystallography and single- 
particle cryo-EM structure determination pro- 
cedures are described, including summaries of 
crystallization and cryo-protection conditions 
(table S6), as well as data collection, process- 
ing, and refinement statistics (tables S7 to S12). 
S. cerevisiae constructs (table S13) and strains 
(table S14), as well as the experimental details 
of the viability and growth assay, subcellular 
nup localization analysis, 60S preribosome 
export assay, and mRNA export fluorescence 
in situ hybridization (FISH) assay, which es- 
tablish the physiological relevance of the bio- 
chemical and structural findings, are provided. 
Details of the incremental quantitative dock- 
ing procedures for nup and nup complex 
crystal and single-particle cryo-EM structures 
into ~12- and ~23-Å cryo-ET maps of the intact
human NPC (constricted state) (38, 47), as well
as into an ~37-Å in situ cryo-ET map of the
human NPC (45) and an ~25-Å in situ map of
the S. cerevisiae NPC (36) (dilated states), are 
provided. Inventories of nup and nup complex 
experimental structures used to generate 
the near-atomic composite structures of the 
S. cerevisiae (table S15) and human (table 
S16) NPCs are provided. 
Quantum technology promises to revolutionize how we learn about the physical world. An experiment
that processes quantum data with a quantum computer could have substantial advantages over 
conventional experiments in which quantum states are measured and outcomes are processed with a 
classical computer. We proved that quantum machines could learn from exponentially fewer experiments 
than the number required by conventional experiments. This exponential advantage is shown for 
predicting properties of physical systems, performing quantum principal component analysis, and 
learning about physical dynamics. Furthermore, the quantum resources needed for achieving an exponential 
advantage are quite modest in some cases. Conducting experiments with 40 superconducting qubits 
and 1300 quantum gates, we demonstrated that a substantial quantum advantage is possible with
today’s quantum processors.


Humans learn about nature through ex-
periments, but until now our ability to
acquire knowledge has been hindered
by viewing the quantum world through
a classical lens. The rapid advancement
of quantum technology portends an opportu- 
nity to observe the world in a fundamentally 
different and more powerful way. Instead of 
measuring physical systems and then process- 
ing the classical measurement outcomes to 
infer properties of those physical systems, 
quantum sensors (1) will eventually be able to
transduce (2) quantum information in physi- 
cal systems directly to a quantum memory (3, 4), 
in which it can be processed by a quantum 
computer. Figure 1A illustrates the distinction 
between conventional and quantum-enhanced 
experiments. For example, in a quantum- 
enhanced experiment, multiple photons might 
be captured and stored coherently at each 
node of a quantum network and then pro- 
cessed coherently to extract an informative 
signal (5, 6, 7). In both the conventional and 
quantum-enhanced settings, multiple copies 
of the same quantum state are acquired. The 
crucial distinction is that the copies are mea- 
sured one at a time in conventional experi- 
ments whereas entangling measurements 
across multiple copies are allowed in quantum- 
enhanced experiments. 
Recent mathematical analyses performed
by some of the authors show that there exist
properties of an n-qubit system that a quan-
tum machine can learn efficiently whereas the
requisite number of conventional experiments 
to achieve the same task is exponential in 
n (8, 9). This exponential advantage contrasts 
sharply with the quadratic advantage achieved 
in many previously proposed strategies for 
improving sensing using quantum technology 
(1). In this article, we propose and analyze 
three classes of learning tasks with exponen- 
tial quantum advantage and report on proof- 
of-principle experiments using up to 40 qubits 
on a Google Sycamore processor (10). These 
experiments confirm that a substantial quan- 
tum advantage can be realized even when the 
quantum memory and processor are both noisy. 

To be more concrete, suppose that each
experiment generates an n-qubit state ρ, and
our goal is to learn some property of ρ (Fig. 1).
We depict conventional and quantum-enhanced
experiments for this scenario in Fig. 1B. In
conventional experiments, each copy of ρ is
measured separately, the measurement data
are stored in a classical memory, and a clas-
sical computer outputs a prediction for the
property after processing the classical data.
In quantum-enhanced experiments, each copy
of ρ is stored in a quantum memory, after
which the quantum machine outputs the pre-
diction after processing the quantum data in
the quantum memory. We proved that for some
tasks, the number of experiments needed to
learn a desired property is exponential in
n with the conventional strategy, but only
polynomial in n using the quantum-enhanced
strategy. For suitably defined tasks, we could
achieve exponential quantum advantage using
a protocol as simple as storing two copies of ρ
in quantum memory and performing an en-
tangling measurement. We also showed that
quantum-enhanced experiments have a simi-
lar exponential advantage in a related scenario
shown in Fig. 1C, in which the goal is to learn
about a quantum process ℰ rather than a quan-
tum state ρ. Advantages of entangling measure-
ments over single-copy measurements have
been noticed previously (11, 12), but our work
goes much further by establishing an advan-
tage that scales exponentially with system size.

Building on previous observations (8, 13), 
we proved that for a task that entails ac- 
quiring information about a large number 
of noncommuting observables, quantum- 
enhanced experiments could have an expo- 
nential advantage even when the measured 
quantum state is unentangled. Our work sub- 
stantially reduces the complexity of the required 
quantum-enhanced experiments, improving 
the prospects for near-term implementation. 
By performing experiments with up to 40 
superconducting qubits, we showed that this 
quantum advantage persisted even when 
using currently available quantum proces- 
sors. We also demonstrated quantum advan- 
tage in learning the symmetry class of a 
physical evolution operator, inspired by re- 
cent theoretical advances (9, 13). Finally, in 
a theoretical contribution we rigorously proved 
that quantum-enhanced experiments have an 
exponential advantage in learning about the 
principal component of a noisy state, as pre- 
viously indicated (14). 

In our proof-of-principle experiments, we 
directly executed the state preparation or pro- 
cess to be learned within the quantum proces- 
sor. In an actual application, the quantum 
data analyzed by the learning algorithm might 
be produced by an analog quantum simulator 
or a gate-based quantum computer. We also 
envision future applications in which quan- 
tum sensors equipped with quantum proces- 
sors interact coherently with the physical world. 
The robustness of quantum advantage with 
respect to noise—validated by our experiments 
using a noisy superconducting device—boosts 
our confidence that the quantum-enhanced 
strategies described here can be exploited 
someday to achieve a substantial advantage 
in realistic applications. 


Provable quantum advantage 


We present three classes of learning tasks and 
the associated quantum-enhanced experiments, 
each yielding a provable exponential advantage 
over conventional experiments. Each result 
is encapsulated by a theorem which we state 
informally. Precise statements and proofs are 
presented in the supplementary materials. 
Our experimental demonstrations are dis- 
cussed below in the section titled Demon- 
strations of Quantum Advantage. The proofs 
proceed by representing a classical algorithm
with a decision tree depicted at the center of
the gray robot in Fig. 1. The tree representation
encodes how the classical memory changes as
we obtain more experimental data. We then
analyzed how the transitions on the tree differ
for distinct measured physical systems to pro-
vide rigorous information-theoretic lower
bounds. A general mathematical framework
building on (13) is given in supplementary
materials, section C.

Fig. 1. Illustration of quantum-enhanced and conventional experiments.
(A) Quantum-enhanced experiments versus conventional experiments. Quantum-
enhanced or conventional experiments interface with a quantum or classical
machine running a quantum or classical learning algorithm that can store
and process quantum or classical information. (B) Learning physical state ρ.
Each experiment produces a physical state ρ. In the conventional setting, we
measure each ρ to obtain classical data (the measurement could depend
on prior measurement outcomes) and store the data in a classical memory. In
the quantum-enhanced setting, ρ can coherently alter the quantum information
stored in the memory of the quantum machine (illustrated by the change in

The first task concerns learning about a
physical system described by an n-qubit state,
ρ. We suppose that each experiment generates
one copy of ρ. In the conventional setting, we
measure each copy of ρ to obtain classical data.
The procedure can be adaptive, that is, each
measurement can depend on the data ob-
tained in earlier measurements. In the quantum-
enhanced setting, a quantum computer can
store each copy of ρ in a quantum memory
and act jointly on multiple copies of ρ. In
both scenarios we require all quantum data
to be measured at the end of the learning
phase of the procedure so that only classical
data survive. After the learning is completed
the learner is asked to provide an accurate
prediction for the expectation value of one
observable drawn from a set $\{O_1, O_2, \ldots\}$,
where the number of observables in the set
is exponentially large in n. The observables
in the set can be highly incompatible, that is,
each observable may fail to commute with
many others in the set.

In prior work (8, 13), we required the learn-
er to predict exponentially many observables,
which is not possible in practice if the system
size is large. To demonstrate the advantage
in an actual device, we proved that predict-
ing just the absolute value of one observable
requires exponentially many copies in the con- 
ventional scenario. By contrast, predicting the 
entire set of observables can be achieved with a 
polynomial number of copies in the quantum- 
enhanced scenario. We thereby established the 
following constant versus exponential separa- 
tion. The proof is given in supplementary mate- 
rials, section D. 


Theorem 1 (Predicting observables): There 
exists a distribution over n-qubit states and 
a set of observables such that in the conven-
tional scenario, at least order $2^n$ experiments
are needed to predict the absolute value of one 
observable selected from the set, whereas a 
constant number of experiments suffice in the 
quantum-enhanced scenario. 

The exponential quantum advantage can
occur even if the state ρ is unentangled. For
example, in our experiments we consider
$\rho \propto (I + \alpha P)$, in which P is an n-qubit Pauli
operator and $\alpha \in (-1, 1)$. This state can be real-
ized as a probabilistic ensemble of product
states, each of which is an eigenstate of
P with eigenvalue ±1. Even if the state is known
to be of this form but P and α are unknown, the
exponential separation between conventional
and quantum-enhanced experiments persists.
Moreover, the quantum advantage can be
achieved by performing simple entangling
measurements on pairs of copies of ρ. That
the quantum advantage applies even when
correlations among the n qubits are classical
leads us to believe that the quantum-enhanced
strategy will be beneficial in a broad class of
sensing applications. In supplementary mate-
rials section G we extend this theorem, show-
ing that a sufficiently large quantum memory
is needed to achieve this task in the quantum-
enhanced scenario.

simply store each copy of ρ. After multiple rounds of experiments, quantum
(C) Learning physical process ℰ.
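The two-copy strategy mentioned above can be illustrated with a small classical simulation. The sketch below is not the authors' analysis code; it relies on the standard identity tr((Q⊗Q)(ρ⊗ρ)) = tr(Qρ)² and on the fact that every two-qubit Bell state is an eigenstate of X⊗X, Y⊗Y, and Z⊗Z, so a single Bell-basis readout of two copies yields a ±1 sample of Q⊗Q for every Pauli string Q at once. All function names, the hidden string "XZY", and the shot count are illustrative assumptions.

```python
import itertools
from functools import reduce

import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

# Eigenvalue of (sigma tensor sigma) on each two-qubit Bell state.
BELL_SIGNS = {
    "Phi+": {"I": 1, "X": 1, "Y": -1, "Z": 1},
    "Phi-": {"I": 1, "X": -1, "Y": 1, "Z": 1},
    "Psi+": {"I": 1, "X": 1, "Y": 1, "Z": -1},
    "Psi-": {"I": 1, "X": -1, "Y": -1, "Z": -1},
}


def pauli_string(label):
    """Matrix of an n-qubit Pauli string such as 'XZY'."""
    return reduce(np.kron, [PAULIS[c] for c in label])


def make_state(label, alpha):
    """rho = 2^{-n} (I + alpha * P) for the Pauli string P = label."""
    dim = 2 ** len(label)
    return (np.eye(dim, dtype=complex) + alpha * pauli_string(label)) / dim


def outcome_sign(outcome, label):
    """Eigenvalue of Q tensor Q (Q = label) on a tuple of Bell outcomes."""
    sign = 1
    for bell, pauli in zip(outcome, label):
        sign *= BELL_SIGNS[bell][pauli]
    return sign


def bell_distribution(rho, n):
    """Outcome probabilities of n pairwise Bell measurements on rho tensor rho."""
    # Each Bell projector expands in sigma tensor sigma terms, and
    # tr((sigma tensor sigma)(rho tensor rho)) = tr(sigma rho)^2.
    weights = {
        s: np.real(np.trace(pauli_string(s) @ rho)) ** 2
        for s in itertools.product("IXYZ", repeat=n)
    }
    outcomes = list(itertools.product(BELL_SIGNS, repeat=n))
    probs = np.array([
        sum(outcome_sign(b, s) * w for s, w in weights.items()) / 4 ** n
        for b in outcomes
    ])
    probs = np.clip(probs, 0.0, None)  # guard against rounding below zero
    return outcomes, probs / probs.sum()


def estimate_squared_expectation(samples, q_label):
    """Sample mean of Q tensor Q, an estimate of tr(Q rho)^2."""
    return float(np.mean([outcome_sign(b, q_label) for b in samples]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, alpha, hidden = 3, 0.95, "XZY"  # hypothetical example values
    outcomes, probs = bell_distribution(make_state(hidden, alpha), n)
    idx = rng.choice(len(outcomes), size=2000, p=probs)
    samples = [outcomes[i] for i in idx]
    for q in ("XZY", "ZZX"):
        print(q, estimate_squared_expectation(samples, q))
```

The same set of Bell samples is reused for every candidate Q, which is the feature a single-copy strategy lacks: the number of samples does not have to grow with the number of candidate observables in this toy setting.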

Our second ML task with a quantum ad-
vantage is quantum principal component
analysis (PCA) (14). In this task each exper-
iment produces one copy of ρ, and our goal is
to predict properties of the (first) principal
component of ρ, namely the eigenstate |ψ⟩ of
ρ with the largest eigenvalue. For example,
we may want to predict the expectation values
of a few observables in the state |ψ⟩. This task
may become a valuable component of future
quantum-sensing applications. If an imperfect
quantum sensor transduces a detected quan-
tum state into quantum memory, the state is
likely to be corrupted by noise. But it is
reasonable to expect that properties of the
principal component are relatively robust with
respect to noise (15) and therefore highly in-
formative about the uncorrupted state. To per-
form quantum PCA, a learning algorithm was
introduced in (14) on the basis of phase esti-
mation, which requires fault-tolerant quantum
computers. One can also obtain information
about the principal component of ρ by using
more near-term algorithms, such as virtual
cooling (16), virtual distillation (17, 18), and
variational algorithms (19, 20).
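The claim that the principal component is robust to noise can be checked numerically in a simple toy model. The sketch below assumes global depolarizing noise (an assumption chosen for illustration, not a noise model taken from the text) and compares three quantities for an observable O: the ideal value ⟨ψ|O|ψ⟩, the naive value tr(Oρ), and the two-copy virtual-distillation ratio tr(Oρ²)/tr(ρ²). Everything here is classical linear algebra on a small matrix; on a device the ratio would be obtained from entangling measurements on two copies.

```python
import numpy as np


def random_pure_state(dim, rng):
    """Normalized complex Gaussian vector (illustrative random state)."""
    vec = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
    return vec / np.linalg.norm(vec)


def depolarize(psi, eps):
    """rho = (1 - eps)|psi><psi| + eps * I / dim (assumed noise model)."""
    dim = psi.size
    return (1 - eps) * np.outer(psi, psi.conj()) + eps * np.eye(dim) / dim


def principal_component(rho):
    """Largest eigenvalue and its eigenvector; eigh sorts eigenvalues ascending."""
    vals, vecs = np.linalg.eigh(rho)
    return vals[-1], vecs[:, -1]


def virtual_distillation(rho, obs):
    """Two-copy estimate tr(O rho^2) / tr(rho^2)."""
    rho2 = rho @ rho
    return float(np.real(np.trace(obs @ rho2) / np.trace(rho2)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, eps = 3, 0.3  # hypothetical example values
    psi = random_pure_state(2 ** n, rng)
    rho = depolarize(psi, eps)
    z0 = np.kron(np.diag([1.0, -1.0]), np.eye(2 ** (n - 1)))  # Z on qubit 0

    _, top = principal_component(rho)
    print("overlap with |psi>:", abs(np.vdot(psi, top)) ** 2)
    print("ideal  :", float(np.real(np.vdot(psi, z0 @ psi))))
    print("naive  :", float(np.real(np.trace(z0 @ rho))))
    print("2-copy :", virtual_distillation(rho, z0))
```

Under this noise model the top eigenvector of ρ is exactly |ψ⟩, and squaring ρ suppresses the noise floor relative to the dominant eigenvalue, which is why the two-copy ratio lies closer to the ideal value than the naive single-copy expectation.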

Although the quantum PCA algorithm in (14)
is exponentially faster than known algorithms
based on conventional experiments, this ad-
vantage was not proven against all possible
algorithms in the conventional scenario. We 
rigorously established the exponential quan- 
tum advantage for performing quantum PCA. 
The exponential quantum advantage also holds 
in some of the near-term proposals (16, 17). The 
proofs are provided in supplementary mate- 
rials section E. 


Theorem 2 (Performing quantum PCA): In 
the conventional scenario, at least order $2^{n/2}$
experiments are needed to learn a fixed prop- 
erty of the principal component of an unknown 
n-qubit quantum state, whereas a constant 
number of experiments will suffice in the 
quantum-enhanced scenario. 

It is worth commenting on recent results in 
(21, 22) showing that quantum PCA can be 
achieved by polynomial-time classical algo- 
rithms, which may seem to contradict Theo- 
rem 2. Those works assume the ability to 
access any entry of the exponentially large
matrix ρ to exponentially high precision in
polynomial time. Achieving such high preci-
sion requires measuring exponentially many
copies of ρ, which takes an exponential num-
ber of experiments and exponential time.
Hence, the assumptions of (21, 22) do not hold
here. See (23), which provides a detailed expo-
sition of these matters.

Another core task in quantum mechanics is
understanding physical processes rather than
states. Here, each experiment implements a
physical process ℰ, and we can interface with
ℰ through a quantum or classical machine in
the quantum-enhanced or conventional set-
ting; see Fig. 1C. We showed that a quantum
machine can learn an approximate model of
any polynomial-time quantum process ℰ from
only a polynomial number of experiments.
Given a distribution on input states, the ap-
proximate model can predict the output state
from ℰ accurately on average. By contrast, we
would need an exponential number of ex-
periments to achieve the same task in the
conventional setting. The proof for general
quantum processes is given in supplementary
materials, section F.


Theorem 3 (Learning quantum processes):
Suppose we are given a polynomial-time phys-
ical process ℰ acting on n qubits and a prob-
ability distribution over n-qubit input states.
In the conventional scenario, at least order $2^n$
experiments are needed to learn an approx-
imate model of ℰ that predicts output states
accurately on average, whereas a polynomial
number of experiments will suffice in the
quantum-enhanced scenario.


Demonstrations of quantum advantage 


The exponential quantum advantage captured 
by Theorems 1, 2, and 3 applies no matter how 
much classical processing power is leveraged
in the conventional experiments. The conven-
tional strategy fails because there is simply no
way to access enough classical data to perform
the specified tasks if the number of experi-
ments is subexponential in n. However, these
exponential separations apply in an idealized 
setting in which quantum states are stored 
and processed perfectly. This leads us to ask 
whether access to quantum memory unlocks 
a substantial quantum advantage under more 
realistic conditions. 

For two different tasks, we have investi- 
gated the robustness of the quantum ad- 
vantage by conducting experiments with a 
superconducting quantum processor. We 
consider specialized tasks that maintain ex- 
ponential quantum advantage and have bet- 
ter noise robustness than the general tasks 
described in the previous section. The first 
task we studied pertains to Theorem 1. The
task is to approximately estimate the mag-
nitude of the expectation value of Pauli ob-
servables. The unknown state is an unentangled
n-qubit state $\rho = 2^{-n}(I + \alpha P)$, in which
$\alpha = \pm 0.95$, P is a Pauli operator, and both α and
P are unknown. After all measurements are
completed and learning is terminated, two
distinct Pauli operators, $Q_1$ and $Q_2$, are an-
nounced, one of which is P and the other of
which is not equal to P. We then ask the
machine to determine which of $|\mathrm{tr}(Q_1\rho)|$ and
$|\mathrm{tr}(Q_2\rho)|$ is larger.
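For the family of states described above, the two candidate magnitudes can be written in closed form. The short derivation below is a sketch that assumes P and the announced operators are non-identity Pauli strings with phase +1 (an assumption not spelled out in the text) and uses only the trace orthogonality of Pauli strings, tr(QP) = 2^n when Q = P and 0 otherwise.

```latex
\[
\mathrm{tr}(Q\rho)
  = 2^{-n}\bigl[\mathrm{tr}(Q) + \alpha\,\mathrm{tr}(QP)\bigr]
  = 2^{-n}\bigl[0 + \alpha\,2^{n}\,\delta_{Q,P}\bigr]
  = \alpha\,\delta_{Q,P},
\]
\[
\bigl|\mathrm{tr}(Q\rho)\bigr| =
\begin{cases}
  |\alpha| = 0.95, & Q = P,\\[2pt]
  0, & Q \neq P .
\end{cases}
\]
```

Under these assumptions the task reduces to deciding which of the two announced operators is the hidden P, with a gap of 0.95 between the two magnitudes.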

In the conventional scenario in which cop-
ies of ρ are measured one by one, the best
known strategy is to use randomized Clifford
measurements requiring an exponential num-
ber of copies to achieve the task with reasonable
success probability (8, 24). In the quantum-
enhanced scenario, by contrast, copies of ρ are
deposited in quantum memory two at a time
and a Bell measurement across the two copies
is performed to extract a snapshot of the state.
In the quantum-enhanced scenario, we con-
sider two different methods for analyzing the
measurement data. The first method uses a
specialized formula for estimating $|\mathrm{tr}(Q\rho)|$,
given in Appendix D2. Figure 2A depicts—as 
a function of the system size n—the number 
of experiments needed in the conventional 
and quantum-enhanced scenarios to achieve 
70% prediction accuracy, in which the data 
from the quantum-enhanced experiments is 
analyzed by this first method. Also shown is a 
theoretical lower bound on the number of ex- 
periments needed in the conventional sce- 
nario, proven in Appendix D4. The first method 
is explicitly tailored to the structure of this 
particular learning problem and so cannot 
be applied readily to other problems. Our 
second method is more flexible and hence 
more broadly applicable; we make predic- 
tions by feeding the measurement data to a 
supervised ML model based on a recurrent
neural network (25, 26, 27), as depicted in
Fig. 2B. In contrast to the first method, the
ML method does not require prior knowledge
about the learning task. We train the neural
network with noiseless simulation data for
small system sizes (n < 8). We then use the
neural network to make predictions when we
are provided with experimental data for large
system sizes 8 ≤ n ≤ 20. We report the predic-
tion accuracy, which is equal to the probability
for correctly answering whether $|\mathrm{tr}(Q_1\rho)|$ or
$|\mathrm{tr}(Q_2\rho)|$ is larger. Figure 2C shows the per-
formance of the ML model as we train the
neural network. Despite the noisy storage 
and processing in the experimental device, 
we observed a substantial quantum advan- 
tage using both the specialized and ML meth- 
ods. Notably, when using ML, training on 
smaller systems sufficed for making good pre- 
dictions on larger systems, a further indication 
that the measurement data in the quantum- 
enhanced scenario is so revealing that no 
special-purpose method is needed to extract 
a clear signal. 

The second task we studied, which pertains 
to Theorem 3, was inspired by the recent ob- 
servation that quantum-enhanced experi- 
ments can efficiently identify the symmetry 
class of a quantum evolution operator, where- 
as conventional experiments cannot (9, 13). 
An unknown n-qubit quantum evolution op-
erator is presented, drawn either from the
class of all unitary transformations or the
class of time-reversal-symmetric unitary trans-
formations (i.e., real orthogonal transforma-
tions). We consider whether an unsupervised
ML model can learn to recognize the symmetry
class of the unknown evolution operator
on the basis of data obtained from either 
quantum-enhanced experiments or conven- 
tional experiments. An illustration is shown 
in Fig. 3A. 

In the conventional scenario, we repeatedly 
apply the unknown evolution operator to the
initial state $|0\rangle^{\otimes n}$ and then measure each qubit
of the output state in the Y-basis. Under
T-symmetric evolution the output state has 
purely real amplitudes; hence the expecta- 
tion value of any purely imaginary observ- 
able, such as the Pauli-Y operator, is always 
zero. By contrast the expectation value of 
Y after general unitary evolution is generically 
nonzero but may be exponentially small and 
hence hard to distinguish from zero. In the 
quantum-enhanced scenario we make use of 
n additional memory qubits. We prepare an 
initial state in which the n system qubits are 
entangled with the n memory qubits, evolve
the system qubits under the unknown evo- 
lution operator, swap the system and mem- 
ory qubits, evolve the system qubits again, 
and finally perform n Bell measurements, 
each acting on one system qubit and one mem- 
ory qubit. 
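The argument for the conventional scenario can be checked numerically with a minimal sketch. It assumes the evolution is Haar-random, so that the output state obtained from |0⟩^⊗n is simply a normalized Gaussian vector (real for orthogonal evolution, complex for general unitary evolution). The function names are hypothetical, and this is not the circuit family used in the experiments.

```python
import math
import random

def random_state(n_qubits, real, rng):
    """Output state U|0...0> for a Haar-random U (assumption): a normalized
    Gaussian vector. real=True mimics real orthogonal (T-symmetric) evolution."""
    dim = 2 ** n_qubits
    amps = [complex(rng.gauss(0, 1), 0.0 if real else rng.gauss(0, 1))
            for _ in range(dim)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]

def expect_pauli_y(state, qubit):
    """<psi| Y_qubit |psi>, using Y|0> = i|1> and Y|1> = -i|0>."""
    mask = 1 << qubit
    total = 0.0
    for b0 in range(len(state)):
        if b0 & mask:
            continue  # visit each (bit = 0, bit = 1) pair of basis states once
        a, b = state[b0], state[b0 | mask]
        total += 2.0 * (a.conjugate() * b).imag
    return total

rng = random.Random(1)
y_real = expect_pauli_y(random_state(4, True, rng), 0)      # real amplitudes
y_complex = expect_pauli_y(random_state(4, False, rng), 0)  # complex amplitudes
```

With purely real amplitudes every term in the sum vanishes, whereas a generic complex state gives a nonzero value whose typical magnitude shrinks as the number of qubits grows.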
Fig. 2. Quantum advantage in learning physical states. (A) Quantum
advantage in the number of experiments needed to achieve ≥70% accuracy.
Here, Q corresponds to results running the best-known strategy for quantum-
enhanced experiments, described in Appendix D2, and C corresponds to results
running the best-known conventional strategy. The dotted line is a lower
bound for any conventional strategy (C, LB) as proven in Appendix D4. Even
running on a noisy quantum processor, quantum-enhanced experiments are seen
to vastly outperform the best theoretically achievable conventional results
(C, LB). (B) Supervised ML model based on quantum-enhanced experiments. n
repetitions of quantum-enhanced experiments are performed and the data are
fed into a gated recurrent neural network (GRU) (25, 26). The neurons in the
GRU are aggregated to predict an output. (C) Training process of the supervised
ML model. We train the supervised ML model to determine which of two
n-qubit Pauli operators has a larger magnitude for the expectation value in an
unknown state ρ with noiseless simulation for small system sizes (n < 8). We
consider the cross entropy (34) as the training loss. Then we use the supervised
ML model to make predictions with data from noisy quantum-enhanced
experiments running on the Sycamore processor (10) for larger system sizes
(8 ≤ n ≤ 20). We consider the probability to predict correctly as the prediction
accuracy. The purple (Q) and gray (C) dots on the y-axis are the accuracy of
the best-known quantum-enhanced and conventional strategy considered in (A).
Random guessing yields a prediction accuracy of 0.5.

Fig. 3. Quantum advantage in learning physical dynamics. (A) Unsupervised
ML model. We perform 500 repetitions of quantum-enhanced experiments (each
accessing ℰ_i twice) for every physical process ℰ_i and feed the data into an
unsupervised ML model (Gaussian kernel PCA) (28) to learn a 1D representation
for describing distinct physical dynamics ℰ_1, ℰ_2, .... Similarly, we also consider
applying unsupervised ML to data obtained from 1000 repetitions of the
best-known conventional experiments (each accessing ℰ_i once) for every
physical process ℰ_i. (B) Representation learned by unsupervised ML for 1D
dynamics. Each point corresponds to a distinct physical process ℰ_i. The vertical

Each evolution operator is a one-dimensional
(1D) or 2D n-qubit quantum circuit as shown 
in Fig. 3D. After sampling many different 
evolution operators from both symmetry clas- 
ses (and obtaining data from each sampled 
evolution multiple times), we used an unsu- 
pervised ML model (kernel PCA) (28) to find 
a 1D representation of the evolution oper- 
ators. The representations learned by the 
unsupervised ML model are shown in Fig. 3, B 
and C. By using the quantum-enhanced data, 
the ML model discovers a clean separation 
between the two symmetry classes, whereas 
there is no discernable separation into classes 
when using data from conventional experi- 
ments. The signal from the quantum-enhanced 
experiments was strong enough that the two 
classes were easily recognized without access 
to any labeled training data. 
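As an illustration of how Gaussian kernel PCA can yield a 1D representation that separates two classes without labels, here is a minimal sketch. The feature vectors are synthetic placeholders, not experimental data; the kernel width `gamma`, the cluster parameters, and all function names are assumptions; and the leading component is found by power iteration rather than a full eigendecomposition.

```python
import math
import random

def rbf_kernel_matrix(points, gamma):
    """Gaussian (RBF) kernel: K[i][j] = exp(-gamma * |x_i - x_j|^2)."""
    n = len(points)
    return [[math.exp(-gamma * sum((a - b) ** 2
                                   for a, b in zip(points[i], points[j])))
             for j in range(n)] for i in range(n)]

def center_kernel(K):
    """Center the kernel matrix in feature space (K is symmetric)."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def top_component(Kc, iters=500):
    """Leading eigenpair of the centered kernel matrix by power iteration."""
    n = len(Kc)
    v = [math.sin(i + 1.0) for i in range(n)]  # deterministic starting vector
    eigval = 0.0
    for _ in range(iters):
        w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = math.sqrt(sum(x * x for x in w))
        v = [x / eigval for x in w]
    return eigval, v

def kernel_pca_1d(points, gamma):
    """Projection of each point onto the first kernel principal component."""
    eigval, v = top_component(center_kernel(rbf_kernel_matrix(points, gamma)))
    return [math.sqrt(eigval) * x for x in v]

# Synthetic stand-ins for per-process measurement features (hypothetical).
rng = random.Random(0)
class_a = [[rng.gauss(0.0, 0.1) for _ in range(4)] for _ in range(10)]
class_b = [[rng.gauss(2.0, 0.1) for _ in range(4)] for _ in range(10)]
z = kernel_pca_1d(class_a + class_b, gamma=0.5)
```

When the two groups of feature vectors are well separated, the entries of `z` for one group share a sign opposite to that of the other group, which is the kind of clean split described for the quantum-enhanced data.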

In supplementary materials section A4, we 
analyzed the measurement data using the 
best-known special-purpose method specifi- 
cally designed to distinguish general unitary 
transformations from real orthogonal trans- 
formations. We found a quantum advantage 
similar to that obtained with the ML model. 
The revelation that unsupervised learning 
yields results that are competitive with a more 
customized analysis highlights the potential 
for discovering previously unknown phenom- 
ena with quantum-enhanced measurement 
strategies. Properties that are blurred beyond 
recognition by single-copy measurements 
are brought into sharp relief by two-copy 
measurements. 


Outlook 


We have investigated how quantum technol- 
ogy can enhance our ability to discover un- 
known phenomena occurring in nature. For 
a variety of tasks, we proved that quantum- 
enhanced strategies that use quantum mem- 
ory and quantum processing can predict 
properties of physical systems using exponen- 
tially fewer experiments than conventional 
strategies. This exponential advantage is 
achievable even if the amount of classical 
processing used in the conventional strategies 
is unlimited and when the physical system 
exhibits only classical correlations. Although 
many previous studies of quantum advan- 
tage have focused on computational tasks with 
known inputs, our work focused instead on 
learning tasks in which the goal is to learn 
about an a priori unknown physical system. 
This work provides a new approach to under-
standing and achieving quantum advan-
tage in quantum ML (29, 30) and quantum
sensing (1).

Our experiments with up to 40 qubits in a 
superconducting quantum processor showed 
that a substantial quantum advantage is 
already evident when using today’s noisy 
intermediate-scale quantum platforms (31).
These experiments demonstrated that super-
vised and unsupervised ML models (27, 32)
employing data obtained from quantum- 
enhanced experiments could predict proper- 
ties and discover underlying structure in 
physical systems that are beyond the scope 
of conventional experiments. 

We envision that future quantum sensing 
systems will be able to transduce detected 
quantum data to a quantum memory and 
then process the stored data with a quan- 
tum computer. Although for now we lack 
suitably advanced sensors and transducers, 
we have conducted proof-of-concept experi- 
ments in which quantum data were directly 
planted in our quantum processor. Never- 
theless, the robust quantum advantage we 
have validated highlights the potential for 
advancing quantum platforms to unlock 
facets of nature that would otherwise remain 
concealed.

Organic acids and glucose prime late-stage fungal biotrophy in maize

Matthias Kretschmer, Djihane Damoo, Sherry Sun, Christopher W. J. Lee, Daniel Croll,
Harry Brumer, James Kronstad*


Many plant-associated fungi are obligate biotrophs that depend on living hosts to proliferate. 
However, little is known about the molecular basis of the biotrophic lifestyle, despite the 
impact of fungi on the environment and food security. In this work, we show that combinations 
of organic acids and glucose trigger phenotypes that are associated with the late stage of 
biotrophy for the maize pathogen Ustilago maydis. These phenotypes include the expression 
of a set of effectors normally observed only during biotrophic development, as well as the 
formation of melanin associated with sporulation in plant tumors. U. maydis and other 
hemibiotrophic fungi also respond to a combination of carbon sources with enhanced 
proliferation. Thus, the response to combinations of nutrients from the host may be a
conserved feature of fungal biotrophy.


Fungi threaten human health, crop pro-
duction, and food security (1, 2). Many
economically important fungal pathogens 
of plants are obligate biotrophs that can- 
not be propagated outside of the host (3). 
Obligate biotrophs also include beneficial mycor- 
rhizal fungi that provide critical nutrients such 
as phosphate to 80% of plant species (4). In 


Fig. 1. Carbon sources influence proliferation, extracellular polysaccharide, and melanin. (A) Cell numbers were compared upon growth in minimal medium with different concentrations of glucose (G), glucose plus malate (G+M), G [...] malate with aeration (G+M+A), or [...] G+M and G+M+A [...] compared with [...] the G 1.5% condition. (B) Culture viscosity to detect extracellular polysaccharide was [...]

general, there is a lack of information about
the nutritional requirements for fungal pro- 
liferation and development in host tissue, 
although genome analyses suggest that the 
loss of specific biosynthetic capabilities condi- 
tions a reliance on host nutrients (3, 5). 

The maize fungal pathogen Ustilago maydis 
can be grown axenically in culture but is ob-
ligately dependent on a plant host to complete
the sexual phase of its life cycle (6-8). This 
phase involves the mating of compatible spo- 
ridia to establish invasive filaments; the de- 
livery of effector proteins; the induction of 
conspicuous tumors on leaves, ears, and tassels; 
and the formation of massive numbers of 
melanized spores in tumors (8). In addition 
to the economic impact of U. maydis on 
maize production, the fungus is unusual be- 
cause the tumor tissue has been prized as a 
culinary delicacy in Mexico since the time of 
the Aztecs (9). 


Induction of biotrophic phenotypes in culture 

During infection, U. maydis reprograms de- 
veloping tumor tissue into a sink for photo- 
synthate, and the carbon sources available to 
the fungus include carbohydrates and organic

3 or 10 days of growth with a viscometer and is reported in seconds of flow time. (C) Detection of melanin in U. maydis cultures in G, G+M, or
G+M+A after 72 hours of growth. Melanin formation in the G+M medium is inhibited with 5 µg ml⁻¹ of tricyclazole. (D) Melanin is associated with cell pellets
and is measurable in culture supernatants. (E) Melanin is cell associated with a range of intensities. In (A), (B), and (D), lines above the bar graphs indicate the
statistical significance for the comparisons at the ends of each line. Significance levels for comparisons of the wild-type strain in different media are **P < 0.01 and
***P < 0.001 according to analysis of variance (ANOVA) with a Tukey procedure as a post hoc test. Error bars indicate SD.

acids (7, 10-14). Given that maize plants carry
out C4 photosynthesis of the nicotinamide
adenine dinucleotide phosphate (NADP)-
malic enzyme subtype in which 75% of CO₂
is initially fixed into malate, we hypothesized 
that metabolic adaptation to organic acids is 
a key determinant of biotrophic proliferation 
for U. maydis (7, 15). We tested this hypothesis 
by culturing the fungus in standard glucose 
medium with the addition of malate [glucose 
plus malate (G+M)] and found that this com- 
bination stimulated cell proliferation, increased 
culture viscosity, and triggered the accumu- 
lation of dark, pigmented cells (Fig. 1 and 
figs. S1 and S2). Other organic acids also trig- 
gered the same phenotypes in combination with 
glucose (fig. S3 and table S1). The increase in 
viscosity was due to the accumulation of extra-
cellular polysaccharides with a β-1,3 glucan
structure commonly found in fungi (fig. S2).
The pigment was cell associated and was re- 
lated to melanin, as determined with the spe- 
cific inhibitor tricyclazole (Fig. 1, C to E, and 
fig. S1). These phenotypic changes prompted 
an investigation of the relevance of the ob- 
served responses to the biotrophic devel- 
opment of U. maydis in maize. We therefore 
examined melanin formation, the role of or- 
ganic acid transporters, the transcription of 
genes for biotrophic effectors, and the con- 
tributions of mitochondrial functions and 
oxygen. 

Melanin formation during U. maydis spor- 
ulation in tumors is catalyzed by the laccase 
Lac1 and the polyketide synthase Pks1 (6).
However, additional enzymes may contribute
to melanin formation because the genome
encodes five other candidate laccases and 
four additional polyketide synthases (17, 18). 
RNA sequencing analysis of cells grown in 
glucose (G) versus cells from the G+M con- 
dition revealed that the transcripts for three 
pks genes (pks3, pks4, and pks5) were elevated 
in the G+M condition (Fig. 2A, figs. S4 and S5, 
and tables S2 and S3). These pks genes are pres- 
ent in a cluster of 15 genes on chromosome 12, 
and their transcript levels are regulated by the 
transcription factor Mtf1 encoded within the
cluster (Fig. 2A) (18). Ten genes in this cluster,
including mtf1 (up-regulated 1714-fold), are
expressed at a late stage of infection [12 days
post-inoculation (dpi)] (14). The transcripts for
mtf1 were elevated in the G+M condition, and
we found that an mtf1 deletion mutant did not

Fig. 2. Regulation and contributions of a melanin gene cluster. (A) A
gene cluster encodes the transcriptional regulator Mtf1 and three polyketide
synthases. The heatmap shows averaged normalized expression values
of three biological replicates for G and G+M at 24 and 72 hours. The
expression values were log10 transformed; the color code for the log10 scale
is from 0 (low expression; dark blue) to 5 (high expression; red).
Sample expression values between 0 and 1 were set to 1 before log10
transformation. (B) Melanin is reduced in the mtf1Δ mutant upon growth
in G+M medium. wt, wild type. (C) Spores of the mtf1Δ mutant have
reduced melanin formation over time during infection of maize
seedlings. (D) Optical measurement of melanin revealed reduced
content in spores of the mtf1Δ mutant at 28 days. (E) Survival of
mtf1Δ spores is reduced upon treatment with CuSO₄. (F) Deletion or
overexpression of the sporulation transcription factor unh1 influences
melanin formation. In (D) and (E), significance levels for comparisons
of mutant and wild-type strains are *P < 0.05, **P < 0.01, and ***P < 0.001
according to t test or ANOVA with a Tukey procedure as a post hoc test.
Error bars indicate SD.

Fig. 3. Candidate transporters influence growth on dicarboxylates and virulence. (A) The jen2Δ and jen20Δ mutants have impaired growth on carboxylates
and a combination of glucose and dicarboxylates. (B) Virulence of the jen2Δ jen20Δ double mutant is reduced on maize seedlings. Significance levels for
comparisons of mutant and wild-type strains are *P < 0.05, **P < 0.01, and ***P < 0.001 according to a t test for infection or ANOVA with a Tukey
procedure as a post hoc test. A Kruskal-Wallis test with Dunn analysis as a post hoc test was used to evaluate growth on aconitate. Error bars indicate SD.
DI, disease index; nt, not tested.


form melanin during growth in G+M medium. 
Furthermore, the mutant showed reduced mel- 
anin content in spores from tumor tissue (Fig. 
2, C and D). That is, the mutant caused di- 
sease in maize seedlings, which led to spore 
development, but the melanin content of 
isolated mtf1Δ spores was reduced by 22.8%
(Fig. 2D). Additionally, the mtf1Δ spores showed
incomplete maturation because their survival
was reduced upon CuSO₄ treatment compared
with wild-type spores (Fig. 2E). 

The transcription factor Unh1 also regulates 
sporulation in U. maydis, and unh1 transcripts
were elevated during growth on G and G+M
media at 72 hours versus G at 24 hours, as well
as in infected plants (table S3) (17, 19). The
unh1Δ mutant was also compromised for mel-
anin formation on G+M, thus further linking
in planta sporulation to the melanin phenotype
induced in culture (Fig. 2F). Complementation
of the unh1Δ mutation resulted in a modified
pigment color, perhaps due to overexpression 
of the gene (Fig. 2F) (19). In addition to Unh1, 
several other virulence-associated regulatory 
factors, including protein kinases and transcrip- 
tion factors, also influenced melanin formation 
in the G+M condition (fig. S6). Therefore, these 
functions are candidate components of a regu- 
latory network for melanin formation. Overall, 
these results suggest that U. maydis has one 
melanin biosynthesis pathway that is dependent
on Lac1 and Pks1 and a second pathway that is
regulated by Mtf1 and Unh1; the latter path-
way is induced during sporulation in planta
and in response to growth in glucose plus 
organic acids. 

The relevance of the response to organic 
acids for biotrophy was tested further by 
constructing mutants that lacked dicarboxyl- 
ate transporters and examining virulence in 
maize seedlings. We mined the genome to 
identify candidate transporters and examined 
their transcript levels in cells from the G+M 
condition (tables S2, S4, and S5). From this 
analysis, we identified two dicarboxylate trans- 
porters, Jen2 and Jen20, that were required for 
robust growth on specific organic acids (e.g., 
aconitate, α-ketoglutarate, succinate, or malate)
and in combination with glucose (Fig. 3A). 
Deletion of both jen2 and jen20 attenuated 
virulence on maize, although some disease 
symptoms still occurred, indicating that ad- 
ditional transporters contribute to in planta 
growth (Fig. 3B). Overall, these results indi- 
cate that the ability to acquire organic acids 
contributes to the virulence of U. maydis 
on maize. 


In vitro expression of disease effectors 


The delivery of effector proteins to suppress 
plant defense and promote virulence is a key 
aspect of biotrophic development (14). For
U. maydis, candidate effectors are expressed
in transcriptional modules of coexpressed 
genes at defined stages of infection, including 
plant surface interactions, establishment of 
biotrophy, nutrient acquisition, and induc-
tion of tumors (14, 20). Many of the effectors
are expressed only during growth in the host
and define virulence-specific modules (14). We
compared the transcriptional response to the 
G and G+M conditions with the established 
modules and found that transcripts encoding 
a subset of effectors involved in biotrophic 
development were elevated in response to car- 
bon sources (figs. S7 to S9 and table S2). That 
is, the transcript levels for some effectors were 
more highly elevated in G+M at 72 hours than 
in either G at 24 hours or G at 72 hours, as was 
demonstrated for the specific effectors Eff1-1,
Ten1, Rsp3, Afu1-3, Mig2-2, and Rrm67 (fig. S8)
(20). Some of the effector genes also displayed
elevated transcripts in G at 72 hours versus G
at 24 hours, indicating a response to glucose
depletion during the stationary phase. These
effectors included the Afu3, See1, Pit2, Cmu1,
Stp1, Stp4, Mig2-4, and Eff1 proteins (fig. S8
and table S2) (20, 21). Many of the effector
genes in U. maydis are found in clusters that 
affect virulence upon deletion (fig. S9) (7). In 
this regard, transcripts for genes in the major 
virulence clusters 2A, 6A, 10A, and 19A were 
elevated in the G+M at 72 hours condition
versus in G at 72 hours (fig. S9 and table S2).
Transcript levels for genes in other clusters 
were also elevated in the G+M conditions (fig. 
S10). These clusters contained genes for the 
biosynthesis of itaconic acid and the siderophore 
ferrichrome A that are highly expressed in vivo 
(22), as well as genes for melanin and primary or 
secondary metabolism (fig. S10 and table S23). 


Overall, we conclude that the G+M culture
condition triggers the transcription of genes
normally expressed only during biotrophic
growth in planta.


Mitochondrial functions and oxygen
influence biotrophy

Mitochondrial functions and oxygen sensing
play roles in the metabolic adaptation of fungi
to the host environment (23). We found that
mitochondrial functions (e.g., electron transport
chain components and Fe-S-requiring proteins)
were enriched in the Functional Catalogue
(FunCat) analysis of the transcriptome data
(figs. S4 and S11 and tables S2 and S6). We
identified 155 genes in the module of general
growth (14) that had elevated transcripts at
24 hours in G+M versus G; one-third of these
genes are annotated as encoding mitochondrial
proteins. Transcripts related to Fe-S cluster-
containing proteins of the electron transport
chain complexes I to III, as well as transcripts 
for other components of complexes I to III 
and alternative oxidase, were elevated in the 
G+M condition (Fig. 4, A and B, and tables S2 
and S6). We therefore tested the influence of 
inhibition of electron transport chain complexes 
on the response to mixed carbon sources and 
found that inhibitors of complex I, the alter- 
native oxidase, and complex IV reduced mel- 
anin and culture viscosity but allowed growth 
in the G+M condition (Fig. 4A). In particular, 
inhibition of complex III reduced growth and 
viscosity, but melanin formation was still ob- 
served, perhaps indicating an uncoupling of 
pigmentation and proliferation. 

Oxygen is the terminal electron acceptor in 
the electron transport chain, so we examined 
conditions with enhanced aeration (A) to as- 
sess the role of oxygen in melanin formation 
(G+M+A) (Fig. 4, C to E). Specifically, aera- 
tion was enhanced in the G+M condition by 
the addition of β-glucanase to reduce viscos-
ity, which led to a concentration-dependent
reduction in melanin (Fig. 4C). The G+M+A 
cultures resulted in 13% higher oxygen levels 
than the G+M condition (Fig. 4D), and the 
speed of culture shaking influenced melanin 
formation, as expected for a negative influence 
of oxygen (Fig. 4E and fig. S1). It is possible that 
oxygen could affect not only mitochondrial 
functions but also oxidation reactions that 
influence the polymerization of melanin pre- 
cursors. Oxygen levels may also be relevant 
during pathogenic development in planta, 
especially given that infection negatively in- 
fluences transcript levels for chloroplast genes, 
including those encoding photosynthetic func- 
tions (24, 25). We evaluated the rates of photo- 
synthesis and respiration and found that 
photosynthesis was inhibited in infected tis- 
sue, especially during tumor formation, whereas 
respiration peaked at 6 dpi (Fig. 4F). The ob- 
served down-regulation of photosynthesis is 
consistent with the results of previous studies 
(25). Noninvasive two-dimensional oxygen mea- 
surements confirmed that oxygen levels were 
also reduced in tumor tissue (Fig. 4G). We con- 
clude that reduced oxygen tension is required 
for the in vitro biotrophic response to organic 
acids. Along with the impact of electron trans- 
port chain inhibition, this result implicates the 
mitochondria and mitochondrial functions as 
potential regulators of fungal biotrophy, per- 
haps through an influence on metabolite con- 
centrations and the generation of signals that 
regulate gene expression. 

We also tested additional fungal species for 
enhanced proliferation, viscosity, or pigmen- 
tation in media with glucose and organic acids 
to determine whether the response might be a 
general feature of biotrophic and hemibio-
trophic fungi (fig. S12). We did not see an
influence on the human pathogens Candida
albicans and Cryptococcus neoformans, the 
saprophyte Saccharomyces cerevisiae, the my- 
corrhizal fungus Laccaria bicolor, or the necro- 
trophic plant pathogen Sclerotinia sclerotiorum. 
By contrast, the biotrophic and hemibiotrophic 
fungi Ustilago hordei, Sporisorium reilianum, 
Fusarium oxysporum, and Verticillium dahliae 
showed increased cell numbers or growth rates 
on the G+M medium compared with on G alone 
(fig. S12). Therefore, the response to mixed car- 
bon sources may be a conserved feature of some 
biotrophic and hemibiotrophic fungi. 

This study reveals a complex response 
of U. maydis to organic acids that involves 
mitochondrial functions, oxygen sensing, spe- 
cific transporters, and transcriptional regula- 
tors of traits related to biotrophy. We also 
observed an accumulation of β-glucan in our
culture conditions. This extracellular poly- 
saccharide is a likely component of the 
mucilage matrix that accumulates during 
sporulation in tumors (26). The response to 
combinations of carbon sources may be a 
conserved feature of biotrophic fungal patho- 
gens as well as other microbes that associate 
with plants. For example, obligate mycorrhizal 
fungi respond to lipids and sugars with pro- 
liferation and pre-spore formation, and rhizobia 
bacteria respond to dicarboxylates such as 
succinate and malate from legume hosts during 
nodulation and nitrogen fixation (27-31). Fur-
thermore, features of bacterial nodulation such
as extracellular polysaccharide production, 
malate metabolism, and oxygen sensing are 
shared with fungal tumor formation (32, 33). 
The response to defined combinations of nu- 
trients may therefore be a general theme in 
plant-microbe interactions.

Circadian alignment of early onset caloric restriction
promotes longevity in male C57BL/6J mice

Victoria Acosta-Rodriguez, Filipa Rijo-Ferreira, Mariko Izumo, Pin Xu, Mary Wight-Carter,
Carla B. Green*, Joseph S. Takahashi*


Caloric restriction (CR) prolongs life span, yet the mechanisms by which it does so remain poorly 
understood. Under CR, mice self-impose chronic cycles of 2-hour feeding and 22-hour fasting, raising the
question of whether it is calories, fasting, or time of day that causes this increased life span. We
show here that 30% CR was sufficient to extend the life span by 10%; however, a daily fasting interval
and circadian alignment of feeding acted together to extend life span by 35% in male C57BL/6J
mice. These effects were independent of body weight. Aging induced widespread increases in gene
expression associated with inflammation and decreases in the expression of genes encoding components
of metabolic pathways in liver from ad libitum-fed mice. CR at night ameliorated these aging-related
changes. Our results show that circadian interventions promote longevity and provide a perspective to
further explore mechanisms of aging.


Caloric restriction (CR) without malnutri-
tion or starvation, which is achieved by
reducing daily food intake by ~30%, is
the most effective nonpharmacological
intervention that improves life span in
model organisms (1), but the underlying mech-
anisms remain unclear (2-6). Classical CR pro-
tocols in mice lead to a temporal restriction of
food intake with a long (>22 h) fasting interval
because mice consume the food as soon as it be-
comes available (7-9). Timed food administra-
tion is a potent signal that entrains circadian
clocks in peripheral tissues such as liver (10-12). 
Thus, in addition to reducing daily energy 
intake, CR resets complex circadian programs 
of gene expression in tissues throughout the 
body (13-15). Although decreased energy intake
is commonly thought to be the critical factor 
that extends life span, it is possible that the 
timing of food intake is also a key component. 
The changes caused by time-restricted feeding 
can have profound effects on physiology (16). 
For example, mice (which are nocturnal) fed a 
high-fat diet only during the day gained sig- 
nificantly more weight than mice fed the same 
diet only during the night (17). Also, mice fed
a high-fat diet restricted to an 8-hour win- 
dow during the night were protected against 
diet-induced obesity, hepatic steatosis, hyper- 
insulinemia, and inflammation compared with 
mice fed ad libitum (AL) (18, 19). Thus, tem- 
porally restricted feeding at night, which is the 
normal active and feeding time of day for mice, 
is beneficial.

Although the timing of food intake can
have an impact on health, it remains unclear
whether the timing and frequency of feeding
also affect life span in mice (16, 20, 21). Food
consumption triggers behavioral and meta- 
bolic changes in mammals that have profound 
impacts on health status (22). We studied the 
contributions of feeding time and fasting 
under CR and compared behavioral, meta- 
bolic, and molecular outcomes throughout the 
life span. We tested five different CR protocols 
and an AL control group using automated 
feeders (7). After 6 weeks of baseline AL food 
access, C57BL/6J male mice were subjected to 
30% CR. Mice were fed nine to ten 300-mg food 
pellets containing 9.72 to 10.8 kcal every 24 h 
starting at the beginning of the day (CR-day) or 
night (CR-night), similar to classical protocols 
in which mice consumed their food within 2 h 
as one meal (7). To prevent the 2-h binge- 
eating pattern and to reduce the fasting in- 
terval to ~12 h, two additional CR groups of 
mice were fed a single 300-mg pellet (1.08 kcal) 
delivered every 90 min to distribute the food 
access over a 12-h window either during the 
day (CR-day-12h) or during the night (CR- 
night-12h). A fifth CR group of mice was fed a 
single 300-mg pellet every 160 min continu- 
ously spread out over 24 h (CR-spread) to 
abolish the rhythmic pattern of food intake 
and to prevent any fasting intervals (Fig. 1A). 
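The arithmetic behind these protocols can be sketched as follows. The per-pellet energy (1.08 kcal) and the delivery intervals (90 and 160 min) come from the text. The choice of nine pellets per day for the distributed groups, the start times (ZT 0 for day, ZT 12 for night), and all names are assumptions made for illustration.

```python
KCAL_PER_PELLET = 1.08   # one 300-mg pellet, as stated in the text
MIN_PER_DAY = 24 * 60

def daily_kcal(n_pellets):
    """Energy delivered per day for a given number of pellets."""
    return round(n_pellets * KCAL_PER_PELLET, 2)

def delivery_times_min(start_min, interval_min, n_pellets):
    """Delivery times in minutes after lights-on (ZT 0), wrapped to one day."""
    return [(start_min + k * interval_min) % MIN_PER_DAY
            for k in range(n_pellets)]

def longest_gap_min(times):
    """Longest interval without a pellet, treating the day as a cycle."""
    ts = sorted(times)
    return max((ts[(i + 1) % len(ts)] - ts[i]) % MIN_PER_DAY
               for i in range(len(ts)))

# Assumed parameters: nine pellets per day, starting at ZT 0 or ZT 12.
PROTOCOLS = {
    "CR-day-12h": delivery_times_min(0, 90, 9),
    "CR-night-12h": delivery_times_min(12 * 60, 90, 9),
    "CR-spread": delivery_times_min(0, 160, 9),
}

for name, times in PROTOCOLS.items():
    print(name, daily_kcal(len(times)), "kcal/day;",
          "longest gap", longest_gap_min(times) / 60, "h")
```

Under these assumptions, nine pellets supply 9.72 kcal per day and ten supply 10.8 kcal, matching the stated range. Nine deliveries at 90-min spacing span 12 h and leave a 12-h gap, whereas nine deliveries at 160-min spacing tile the full 24 h with no gap longer than 160 min.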


Behavioral and body weight dynamics 

with age 

To select the diet for the longevity studies, 
we first compared standard laboratory chow 
(Teklad Global 2018) with two different preci- 
sion food pellets with similar caloric content 
but different compositions: a grain-based diet 
that we used previously (F0170) (7) and a pu-
rified diet (F0075) (fig. S1A). We found that
mice fed the purified diet showed body weight
gain similar to the standard laboratory chow;
however, the mice fed grain-based pellets
gained significantly more weight (fig. S1B). 
Because the composition of grain-based diets 
is known to vary by batch and by season of the 
year (23) and because longevity experiments 
require at least 4 years, we chose the purified 
diet that could be completely defined and 
maintained over the entire duration of the 
life-span experiments and did not cause ex- 
cessive weight gain compared with standard 
laboratory chow. We used an automated feed- 
ing system (7) and monitored feeding and 
wheel-running activity of individually housed 
mice continuously throughout their life span. 
This allowed us to measure behavioral and 
metabolic changes in mice under all six feed- 
ing conditions as they aged. 

In agreement with previous studies (24), 
mice under unrestricted feeding (AL) gradually 
increased their body weight until 20 months 
of age, after which they showed an age-related 
decline (Fig. 1B and fig. S2). All CR groups 
maintained lower body weights throughout 
their life span, consistent with lower food in- 
take (Fig. 1B and fig. S3). We previously showed 
that CR-day mice gained more weight than 
CR-night mice with the grain-based diet (7), 
but this effect was not reproduced with the 
purified diet (fig. S2), perhaps because of the 
difference in fat source in the two diets (fig. 
S1A). Long-term recordings of feeding events 
showed that mice adjusted their feeding pat- 
terns to match the externally controlled avail- 
ability of food (including daytime feeding and 
24-h spread-out feeding). These feeding pat- 
terns were consistently maintained through- 
out their life span (Fig. 1, C and E; fig. S4A; 
and data S1). Mice in the AL group normally 
consumed ~75% of their food at night and 
maintained this pattern of food consumption 
throughout their life span, with a gradual in- 
crease in food consumption with age after the 
first year (Fig. 1F and fig. S4A). Mice in the 
CR-night-2h and CR-day-2h groups with 24-h 
access to 30% CR (relative to AL controls for 
the first 200 days of study) rapidly consumed 
their daily allotment within 2 h, as previously 
described (Fig. 1, C and E) (7), and this 2-hour 
intake pattern was maintained throughout 
their life span (fig. S4A). Although AL mice 
increased their food consumption after 1 year 
of age, the amount of food was not increased 
for the CR groups (fig. S3), so the CR increased
from 30 to ~40% compared with AL at later 
ages. Similarly, animals exposed to CR with 
food access spread over 12 or 24 h also adapted
to the imposed meal pattern by eating each 
pellet as soon as it became available, which was 
every 90 min (CR-day-12h and CR-night-12h) 
or every 160 min (CR-spread) (Fig. 1, C and E, 
and figs. S3 and S4A). When examining the 
median phase of feeding, which is the time at 
which mice ate 50% of their daily allotment, 
we observed that the phase for classic CRs 
was 1 h after the food onset [Zeitgeber time
(ZT) 1 h for CR-day-2h and ZT 13 h for the
CR-night-2h]. For the CR-spread group, the 
phase was ZT12 because the mice ate equal 
amounts during the day and night (fig. S4A). 
The feeding pattern was also consistent with 
daily changes in body weight (fig. S4B). With 
the exception of the CR-spread group, in which 
the food was equally distributed throughout
the day, body weight changed throughout
24 h with a significant increase during the
feeding time (fig. S4B). This finding was more 
pronounced in classic CR protocols, with the 
highest body weight gain of 3 g occurring 
between ZT0 and ZT4 in CR-day-2h mice
and between ZT12 and ZT16 in CR-night-2h 
mice. These body weight gains are consistent
with the observation that mice eat their
entire allotment (2.7 to 3 g) as one single meal
within 2 h.

[Fig. 1 caption, continued] throughout the experiment. (C) Examples of double-plotted
actograms from each experimental group overlaying wheel-running (black histograms) and
feeding (red dots) behaviors. All mice were on AL feeding for the first 6 weeks (period
above the line on the right of the actograms) before the CR began. (D and E) Twenty-four-hour
profile of wheel-running activity (D) and food intake (E) at different ages (averaged over
21 days, n = 36 to 43 mice) for each group. (F) Energy intake per day (left axis) and number
of food pellets per day (right axis) for each group throughout the experiment. For the AL
group, the dark line is the average and the gray shading is SE. All CR groups were limited to
70% of AL consumption for the first 200 days of age and were not adjusted after 200 days,
so no variation is observed. (G) Daily wheel-running activity (average counts/min over
24 hours ± SE) throughout the experiment.
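
The median phase of feeding described above (the time at which mice have eaten 50% of their daily allotment) can be illustrated with a minimal Python sketch. The hourly intake profiles below are hypothetical and idealized: a 3-g allotment eaten evenly over the first 2 h after food onset for the 2-h groups, and evenly over 24 h for CR-spread. The function name, the 1-h binning, and the linear interpolation within a bin are assumptions made for illustration, not the analysis code used in the study.

```python
def median_feeding_phase(hourly_intake):
    """Return the ZT (hours) at which cumulative intake reaches 50%.

    hourly_intake: 24 values, food eaten (g) in each 1-h bin starting at ZT0.
    Assumption: intake is uniform within a bin (linear interpolation).
    """
    total = sum(hourly_intake)
    if total <= 0:
        raise ValueError("no food intake recorded")
    half = total / 2.0
    cumulative = 0.0
    for hour, amount in enumerate(hourly_intake):
        if amount > 0 and cumulative + amount >= half:
            # Fraction of this bin needed to reach the 50% mark.
            return hour + (half - cumulative) / amount
        cumulative += amount
    raise RuntimeError("50% mark not reached; check the input values")


# Hypothetical, idealized daily profiles (3 g per day; not data from the study).
cr_day_2h = [1.5, 1.5] + [0.0] * 22                 # food onset at ZT0
cr_night_2h = [0.0] * 12 + [1.5, 1.5] + [0.0] * 10  # food onset at ZT12
cr_spread = [0.125] * 24                            # evenly spread over 24 h

for name, profile in [("CR-day-2h", cr_day_2h),
                      ("CR-night-2h", cr_night_2h),
                      ("CR-spread", cr_spread)]:
    print(name, "median phase: ZT", median_feeding_phase(profile))
```

Under these idealized profiles the sketch returns ZT1, ZT13, and ZT12, in line with the phases reported in the text; real feeding records would be binned from the recorded feeding events rather than constructed by hand.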

All groups maintained a normal nocturnal
locomotor activity pattern for life, with the
exception of the day-fed mice, which tended
to have more daytime activity (Fig. 1, C and D).
Overall, these long-term recordings showed
that when food was restricted to the daytime,
mice interrupted their “rest phase” to eat but
maintained most activity during the nighttime
(Fig. 1, C and D). Therefore, with daytime
feeding, feeding and locomotor activity are
misaligned in these animals throughout their
life span, which would be expected to lead to
adverse metabolic consequences (25, 26).
Activity of the mice declined as they aged (27),
with AL mice having the lowest activity levels
compared with the CR groups between 6 and
18 months of age [Fig. 1G and fig. S5; two-way
analysis of variance (ANOVA); age, P < 0.0001;
feeding, P < 0.0001; interaction nonsignificant
(NS)].

[Fig. 2A, inset]
Group           Median life span (days)   Log-rank Mantel-Cox (P value)
AL              792
CR-spread       875                       0.0061
CR-day-12h      942                       0.0001
CR-day-2h       959                       0.0002
CR-night-12h    1058                      <0.0001
CR-night-2h     1068                      <0.0001

[Fig. 2A: survival (%) versus age (days, 0 to 1200) for each group. Right panel: schematic of CR
with no fasting, CR + fasting & misaligned timing, and CR + fasting & aligned timing, with the
median life span increase vs AL.]
[Fig. 2B: scatter plots of life span versus activity at 6, 12, 18, 24, 30, and 36 months.]
[Fig. 2C: table of pathologies by feeding group (including hepatocellular carcinoma, histiocytic
sarcoma, lymphoma, and glomerulonephritis) and bar chart of the frequency of affected tissues
(including liver, lungs, and kidneys).]

Fig. 2. Extent of CR-mediated increases in longevity depends on feeding time. (A) Survival curves
for each group (n = 43 for AL and n = 36 for each of the CR groups) are shown in the left panel and
the median life span (days) is shown in the inset. Right panel summarizes the results, showing the
increase in life span from timed feeding with the largest increase when food is restricted to night.
(B) Correlation plots comparing life span (days) for each mouse with its daily averaged total activity
(counts/min) at different ages. Increased activity significantly correlates with longer life span in old,
but not young, mice in all groups (see asterisks, Spearman correlation). (C) Necropsy followed by
histopathology results showing pathologies and diseases (left) and tissues mostly affected (right) at the
time of death for each feeding condition.


Life-span extension by CR depends on 
feeding time 


We investigated the contributions of calories,
feeding time, and fasting period to longevity.
CR was sufficient to extend the median life 
span in male mice, but the range of this ex- 
tension depended on when the food was con- 
sumed (Fig. 2A). The percentage of life-span 
extension varied across conditions. Consistent 
with other reports, AL mice had a median life 
span of 792 days (24, 28). CR-fed mice lived 
10 to 35% longer than AL mice depending on 
the CR group. The CR-spread group, which 
had a 30% reduction in calories but with feed- 
ing spread throughout the day-night cycle, had 
a median life span of 875 days, which is 10.5% 
longer than that of AL mice, demonstrating 
that CR alone without time restriction or fast- 
ing is sufficient to extend longevity. The CR- 
day-12h and CR-day-2h groups had median 
life spans of 942 and 959 days, respectively, 
which are 18.9 and 21.1% longer than the life 
spans of AL mice, respectively. Thus, in addi- 
tion to the reduction in calories, a minimum 
of 12 hours of fasting induces its own ben- 
efits on longevity. There were no significant 
differences in life span when day-fed mice 
fasted for ~22 hours versus 12 hours, indicat- 
ing that 12 hours of fasting is sufficient. CR- 
night fed mice outlived both CR-day groups: 
The CR-night-12h and CR-night-2h mice had 
median life spans of 1058 and 1068 days, re- 
spectively, which are 33.6 and 34.8% longer 
than the life spans of AL mice, respectively. 
Again, there was no additional benefit of 
~22 hours of fasting compared with 12 hours 
of fasting in these groups fed at night, indi- 
cating that 12 hours of fasting is sufficient for 
prolonging life span. There was a significant 
extension of life span by CR-night-2h mice 
(34.8% extension) over CR-day-2h mice (21.1% 
extension) (log-rank Mantel-Cox, P < 0.05), 
which differed in the relative phase of food 
consumption by the mice. It is possible that 
sleep disruption due to misaligned feeding 
could contribute to the difference in life span. 
Further studies are required to determine 
whether sleep is affected and, if so, if potential 
sleep disruptions can contribute to the differ- 
ences in life span observed between CR-day 
versus CR-night animals. However, it has recently
been shown that sleep homeostasis is
maintained in mice under a restricted feeding
schedule in which food is only available for 4 h
in the middle of the “sleep” phase (ZT4 to ZT8) 
(29). In all five of the CR groups in this study, 
the mice consumed exactly the same number 
of daily calories throughout their life span 
(figs. S3 and S4; two-way ANOVA; age, P < 
0.0001; feeding, P < 0.0001; interaction, P < 
0.0001), yet the pattern and circadian phase 
of feeding had major effects on life span. A 
>12-hour fasting interval combined with noc- 
turnal (normal) feeding yielded the greatest 
benefits on life span. Thus, calories are pro- 
cessed differently depending on when they 
are consumed, and anti-aging interventions 
such as CR can be optimized by timing them 
to a specific time of day. Maximum life span, 
estimated as the 10% longest-lived mice in 
each group, was significantly longer in all of 
the CR groups compared with AL except for 
the CR-spread group (Fisher’s exact test, P <
0.05) (data S1). Among the CR groups, only
CR-night had a significant increase in maximum
life span compared with CR-spread (Fisher’s
exact test, P = 0.0256 CR-night versus
CR-spread). This suggests that feeding and fasting
cycles that are in sync with internal circadian 
clocks (CR-night-2h) extend both the median 
and maximum life span, which is indicative 
of delaying the aging process as opposed to 
delaying the onset of a single disease. 
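
The life-span extensions quoted above follow directly from the reported median life spans. The short Python sketch below recomputes them as a consistency check; the median values are those given in the text, whereas the dictionary and function names are illustrative only.

```python
# Median life spans (days) as reported in the text and in Fig. 2A.
MEDIAN_LIFE_SPAN_DAYS = {
    "AL": 792,
    "CR-spread": 875,
    "CR-day-12h": 942,
    "CR-day-2h": 959,
    "CR-night-12h": 1058,
    "CR-night-2h": 1068,
}


def percent_extension(group_days, control_days):
    """Percent increase of a group's median life span over the control's."""
    return 100.0 * (group_days - control_days) / control_days


control = MEDIAN_LIFE_SPAN_DAYS["AL"]
extension = {
    group: round(percent_extension(days, control), 1)
    for group, days in MEDIAN_LIFE_SPAN_DAYS.items()
    if group != "AL"
}

for group, pct in extension.items():
    print(f"{group}: +{pct}% vs AL")
```

The computed values (10.5, 18.9, 21.1, 33.6, and 34.8%) match the percentages stated in the text.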

Necropsy followed by histopathology re- 
vealed that all groups had similar diseases at 
death, but in the CR groups, these diseases 
occurred at older ages (coinciding with lon- 
ger life spans) as previously reported (8). 
Neoplasias were the most frequent pathology 
in all groups, with histiocytic sarcomas being 
the major cause of death, followed by hepato- 
cellular carcinoma (Fig. 2C and data S1). His- 
topathology analysis of the target tissues also 
revealed the highest incidence of lesions in the 
liver (Fig. 2C). 


Age-related decline in activity predicts lower 
survival in mice 


To evaluate whether behavioral or metabolic 
parameters correlated with longer life spans, 
we compared feeding (total daily intake), body 
weight, and wheel-running activity (total daily 
activity and percentage nighttime activity) with 
life span. We found no correlation of life span 
with food intake or body weight at any age 
(figs. S6 and S7). However, in all feeding con- 
ditions, daily locomotor activity level pos- 
itively correlated with longer life span after 
18 months of age (Fig. 2B). Additionally, higher 
activity levels during the normal circadian 
phase (at night) after 24 months of age also 
positively correlated with longer life span (figs. 
S8 and S9). Because voluntary wheel-running 
activity does not affect life span in mice (30), 
our results suggest that the level of wheel-running
activity after 18 months of age could
be a biomarker for health span. Thus, the activ- 
ity level of mice at older ages (>18 months) can 
be used as a predictor of longer life span. 
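
The activity and life-span comparison relies on Spearman correlation (Fig. 2B), which is the Pearson correlation of rank-transformed values. A minimal, dependency-free Python sketch is given below. The six activity and life-span values are made up for illustration and are not data from the study, and the helper names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values for illustration only (not data from the study).
activity_counts_per_min = [2.1, 3.4, 1.2, 4.8, 3.9, 2.7]
life_span_days = [801, 930, 760, 1010, 1100, 905]

rho = spearman(activity_counts_per_min, life_span_days)
print(f"Spearman rho = {rho:.3f}")
```

A significance test for rho is omitted here; in practice a statistics library would normally supply both the coefficient and its P value.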


CR promotes widespread metabolic benefits 


Body composition analysis at 12 and 20 months 
of age showed that although fat mass was sig- 
nificantly higher in the AL group versus the 
CR groups, there were no differences in fat mass 
among the CR groups (fig. S10). We assessed 
metabolic markers from plasma at 6 and 
19 months of age. Insulin levels increased 
with age under AL, and such increases were 
attenuated by all CR groups, which maintained 
low insulin levels at both ages (fig. S11). In 
young mice, CR groups had similar insulin 
levels yet lower glucose levels in plasma com- 
pared with the AL group (figs. S10 and S11), 
suggesting that improved insulin sensitivity 
was associated with lower food intake. As the 
mice aged, similar levels of circulating glucose 
were found in all feeding groups even though 
AL-fed mice had higher insulin levels than CR 
groups, suggesting that CR generally protects 
against age-related insulin resistance. Similar 
to what was seen with insulin, there was a 
significant age-related increase in leptin under 
AL that did not occur in the CR groups. Al- 
though all CR groups maintained lower leptin 
levels than the AL group at 19 months of age, 
only the CR-night groups had leptin levels that 
were significantly lower at younger ages, sug- 
gesting that alignment of feeding may play a 
role in regulating satiety at both ages. Glucagon- 
like peptide 1 (GLP-1) is an intestinal hormone 
that is secreted after meals and decreases 
blood sugar levels by promoting insulin secretion
and suppressing glucagon release (31). We
found that although the levels of GLP-1 did 
not change with aging, the longest-lived (CR- 
night) groups had significantly lower levels at 
older ages compared with AL (fig. S11). This 
indicates that CR-night mice may have had an 
improved sensitivity to GLP-1 in regulating 
insulin secretion and glucose levels. Overall, 
the longest-lived CR groups had improved hor- 
monal profiles, insulin sensitivity, and glucose 
homeostasis as they aged. 


Gene expression changes with aging and CR 


To determine the effects of aging and feeding 
at the molecular level, we assessed circadian 
gene expression patterns using RNA sequenc- 
ing (RNA-seq) in mouse liver in all six feeding 
conditions at two ages: young mice at 6 months 
and old mice at 19 months. We chose to pro- 
file the liver because it is a major metabolic 
target of the circadian clock system (32) and 
because it had the highest incidence of age- 
related lesions among target tissues (Fig. 2C). 
We chose 6 months of age to assess fully mature 
adult mice and 19 months of age to assess aged 
mice before their obvious decline in body weight
(Fig. 1B) and before there was >10% mortality 
in AL mice (Fig. 2A). In each group, we pro- 
filed the liver at 12 time points every 4 hours 
for 48 hours across two circadian cycles in 
mice transferred to constant darkness to assess 
circadian rather than diurnal cycles (two biological
replicates × 12 time points = 24 liver
samples per feeding condition). We chose to 
perform these gene expression experiments 
in constant darkness to determine whether a 
rhythmic gene profile is circadian. Because 
liver-cycling gene expression patterns are 
strongly driven by feeding cycles (33), the 
absence of light:dark (LD) cycles would be 
expected to yield results for rhythmic gene 
expression in the liver similar to those seen 
in LD cycles reported previously (11, 33). Un- 
biased principal component analysis showed 
that (i) samples from young and old groups 
clustered separately under AL, (ii) young groups 
under CR clustered together and separately 
from AL, and (iii) old groups under CR clus- 
tered together between aged-AL and young- 
CR groups (Fig. 3A). This suggests that at the 
molecular level, the aging process is differ- 
ent between mice fed AL or CR, and aged-CR 
mice maintained a liver gene expression pat- 
tern more similar to that of the young animals. 
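
The sampling design described above can be tallied with a few lines of Python. The interval, duration, replicate number, ages, and feeding conditions come from the text; the assumption that the first sample is taken at hour 0 and the overall total (which presumes that no samples were excluded) are illustrative.

```python
SAMPLING_INTERVAL_H = 4
SAMPLING_DURATION_H = 48
REPLICATES_PER_TIME_POINT = 2
AGES_MONTHS = (6, 19)
FEEDING_CONDITIONS = (
    "AL", "CR-spread", "CR-day-12h", "CR-day-2h", "CR-night-12h", "CR-night-2h",
)

# Assumption: sampling starts at hour 0, giving 0, 4, ..., 44 h.
time_points_h = list(range(0, SAMPLING_DURATION_H, SAMPLING_INTERVAL_H))

samples_per_condition_per_age = len(time_points_h) * REPLICATES_PER_TIME_POINT
samples_per_condition = samples_per_condition_per_age * len(AGES_MONTHS)
total_samples = samples_per_condition * len(FEEDING_CONDITIONS)

print(len(time_points_h), "time points:", time_points_h)
print(samples_per_condition_per_age, "livers per condition per age")
print(samples_per_condition, "livers per condition")
print(total_samples, "livers in total (if none were excluded)")
```

This gives 12 time points and 24 livers per feeding condition at each age, as stated in the text.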

To address the overall impact of aging at the 
molecular level, we performed differential gene 
expression analysis between young and old 
mice (data S2 and S3). A total of 2599 genes 
(18.6% of expressed genes) were differentially 
expressed with aging under AL feeding (Fig. 3B 
and fig. S12). Of these, 2031 genes were up- 
regulated and 568 were down-regulated. Gene 
ontology (GO) analysis revealed that the up- 
regulated genes were highly significantly 
related to immune system processes and 
inflammation (Fig. 3C), whereas the down- 
regulated genes were related to metabolism 
(Fig. 3D and data S4). This is consistent with 
previous reports showing that increased in- 
flammation and senescence are hallmarks of 
aging (3) and recent work on glycine-serine- 
threonine metabolism in longevity (34). Among 
the up-regulated genes were Cd36 (log2FC =
3.8, adjusted P = 2.95 x10~°”), which is a
multifunctional glycoprotein that acts as a
receptor for ligands such as thrombospondin,
fibronectin, amyloid beta, oxidized low-density
lipoprotein, and long-chain fatty acids, among
others (35), and the peroxisome
proliferator-activated receptor gamma (Pparg, log2FC =
1.8, adjusted P = 1.5 x 107), a transcription
factor also associated with immunosenescence
during aging (Fig. 3C) (36, 37). Other genes of
note were Adora1, Serpine1, Themis, 10 Toll-like
receptor (Tlr) genes, Spon2, and Zcchc11
(data S2 and S3). Among the down-regulated 
genes, there was an enrichment in key enzymes 
of amino acid (Gnmt and Agxt) and cholesterol
metabolism such as HMG-CoA reductase
(Hmgcr), which is implicated in liver disease
and hepatocellular carcinoma (38-41) (Fig. 3D).
Other metabolic genes of note were Got, Lepr,
Lpin1, Pfkfb3, Scap, Hsd17b2, and Hsd3b5, as
well as 14 cytochrome P450 (Cyp) genes and
28 solute carrier (Slc) genes (data S2 and S3).
Overall, aging under AL feeding increased the
expression of inflammatory genes and decreased
the expression of metabolic genes.

Fig. 3. Gene expression signatures in liver change during aging in AL mice.
(A) Principal component analysis of gene expression from liver mRNA-seq.
mRNA-seq data are from 48 mice for each feeding condition (n = 24 from
6 months of age, n = 24 from 19 months of age), with livers collected every
4 hours over 48 hours while mice were in constant dark. Circles indicating young
AL mice (solid line) are clustered together, and triangles indicating aged AL
mice (dashed line) are in a distinct cluster. Liver gene expression data among
CR groups cluster together independently of age. (B) Volcano plot showing
differential gene expression in young versus aged AL mice. Red denotes genes
for which expression is significantly increased in aged AL mice; blue denotes
genes that are significantly decreased in old AL mice. The numbers of mRNAs in
each category are shown in the right panel. (C) GO terms of genes that are
increased in aged AL mice (top) and examples of gene expression (bottom).
Gray indicates young AL, and red indicates aged AL. (D) GO terms of genes that
are decreased in aged AL mice (top) and examples of gene expression (bottom).
Gray indicates young AL, and blue indicates aged AL.


CR alone rescues most age-related changes
observed under AL


To determine the overall effect of CR, we analyzed
which genes underwent age-related
changes in any condition and identified those
with expression levels that remained protected
under CR versus AL. Across all feeding conditions,
there were 4077 genes with expression
that was up- or down-regulated with aging
(Fig. 4A and data S2), indicating that 29% of
the liver transcriptome is susceptible to
aging-related changes in at least one of the
conditions tested.
AL-fed mice had the highest percentage of
genes that changed with aging (18%), whereas
all of the CR groups had lower overall changes
in gene expression with age (Fig. 4A). The
CR-night-2h group had the smallest overall change
in gene expression with age (4%) (Fig. 4A).
Age-related fold changes of individual genes in the
CR-spread and CR-day groups were partially
decreased compared with the fold changes seen
in AL, whereas CR-night-2h strongly attenuated
these age-related changes. Figure 4B shows
these results as correlation plots of age-related
fold change compared with AL for all CR
groups. If there were no rescue by CR, the data
points would fall on the unity line (slope = 1),
whereas with complete rescue by CR, the
data points would fall on the horizontal line
(slope = 0). As seen in Fig. 4B, the regression
line for the CR-night-2h group was almost
completely flat, demonstrating that CR-night-2h
strongly reduced age-dependent changes
in gene expression. Approximately 50% of
age-related changes in gene expression under AL
(1233 genes) were restored in every CR condition
(Fig. 4C and data S2) by rescuing 44%
of up-regulated genes (inflammation and immune
function) and 60% of down-regulated
genes (metabolic pathways). This result indicates
that CR alone prevents most of the
age-related changes observed in the control AL
condition.
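
The slope interpretation used for Fig. 4B (slope = 1 for no rescue, slope = 0 for complete rescue) can be illustrated with a minimal Python sketch that regresses age-related log2 fold changes under CR on those under AL. The fold-change values below are made up for illustration and are not data from the study, and the ordinary least-squares helper is an assumption; the fitting procedure used for the figure is not specified in this section.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (model with an intercept)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Made-up age-related log2 fold changes under AL (not data from the study).
al_log2fc = [-2.0, -1.0, -0.5, 0.5, 1.5, 3.0]

# Three hypothetical CR outcomes for the same genes.
no_rescue = list(al_log2fc)                    # same age-related change as AL
partial_rescue = [0.5 * v for v in al_log2fc]  # age-related change halved
full_rescue = [0.0 for _ in al_log2fc]         # no age-related change

slopes = {
    "no rescue": ols_slope(al_log2fc, no_rescue),
    "partial rescue": ols_slope(al_log2fc, partial_rescue),
    "full rescue": ols_slope(al_log2fc, full_rescue),
}
for label, slope in slopes.items():
    print(f"{label}: slope = {slope:.2f}")
```

In this toy example the slopes are 1.00, 0.50, and 0.00; a nearly flat regression line, as described for CR-night-2h, corresponds to the last case.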

GO analysis showed that the genes protected 
by CR were also associated with immune func- 
tion, inflammation, and metabolism (Fig. 4C 
and data S4) (34). Among these genes were those
encoding microtubule-associated protein
tau (Mapt) and apolipoprotein A-IV (Apoa4),
the expression of which was up-regulated in
old-AL mice but maintained at lower (young)
expression levels in all CR groups (Fig. 4C).
These genes have been linked with aging and
neurodegeneration in Alzheimer’s disease
(3, 42). A similar age-related up-regulation
occurred in the liver and was reduced by CR. For
metabolic genes that declined with age under
AL, such as insulin-like growth factor-binding
protein 2 (Igfbp2), CR strongly attenuated changes
in expression in all five CR groups.

Fig. 4. CR ameliorates age-related changes in liver gene expression
observed under AL conditions. (A) Schematic comparison of differential
gene expression between young and old mice in the six feeding conditions.
Circles show the percentage of genes that are unchanged between young
and aged mice (gray), increased in aged mice (red), and decreased in aged
mice (blue). Pie chart on the right shows the percentage of genes susceptible
to age-related changes in any feeding condition. (B) Spearman correlation
plots comparing age-related changes in expression of the aging DE genes
between AL and CR groups (aging DE genes are defined here as the
4077 genes that change with age in any of the six feeding conditions tested).
(C) Schematic representation of genes that are protected from age-related
changes in every CR group (left) with GO terms of those significantly
up-regulated (middle) or down-regulated (right) in aged AL mice. Represented
are 10 nonredundant terms of the top 25 most significantly enriched terms. Examples
of age-related fold changes (log2FC ± SE) in gene expression are shown
below for all feeding conditions. Gray-shaded areas indicate FC < 1.5,
considered a nonsignificant change. (D and E) Schematic representation,
GO, and representative genes that maintain similar levels between young
and old ages due to fasting (day or night) (D) and circadian alignment of
feeding and fasting cycles (E).


Fasting-related genes 


To evaluate gene expression changes caused 
by fasting, we compared genes that were dif- 
ferentially expressed only in AL and CR-spread 
but remained constant in all the other four CR 
groups with some degree of fasting. We found 
159 genes of this type that could be potential 
candidates accounting for the change from 
10 to 20% life-span extension (Fig. 4D and 
data S2). Among these were collagen type XII 
alpha 1 chain (Col12a1), which is associated 
with cancer (43) and is elevated in liver fibrosis,
and pleomorphic adenoma gene 1 (Plag1),
an oncogene associated with hepatoblastoma
and age-related decrease in skeletal muscle
(44, 45). Other genes included chromatin assembly
factor 1 subunit A (Chaf1a) (46) and
histidine ammonia-lyase (Hal) involved in 
amino acid metabolism. GO analysis revealed 
that these fasting-related genes were also en- 
riched for thermogenesis pathways. 


Time-related genes 


To evaluate the beneficial effect of feeding 
time, we searched for genes that maintained 
similar levels at young and old ages only in the 
CR-night fed groups with the longest life spans 
but were differentially expressed in the AL, 
CR-spread, and CR-day fed groups. We found 
68 genes that were specifically protected in the 
CR-night groups (Fig. 4E and data S2). GO
analysis revealed specific subsets of genes 
involved in the immune system and inflamma- 
tion, such as glutathione S-transferase mu 3 
(Gstm3), which protects against oxidative 
stress; lymphocyte antigen 6 family member E 
(Ly6e), which regulates T-cell proliferation, 
differentiation, and activation; triggering 
receptor expressed on myeloid cells-like 2 
(Treml2); and proinflammatory cytokines 
such as interleukin-1β (Il1b) that are increased
in the elderly (3). Thus, circadian alignment of 
feeding time adds another level of protection 
of immune function and age-related inflam- 
mation beyond that seen with CR and fasting. 


Circadian cycling of gene expression with aging 


Robust circadian rhythms are at the core of 
a healthy physiology, and rhythm amplitude 
decreases in response to infectious diseases 
and aging (22, 47, 48). We investigated how
feeding conditions, timing, and CR influenced 
circadian cycling of gene expression in young 
and aged mice. To search for circadian cycl- 
ing genes, we performed RNA-seq from liver 
samples collected every 4 hours for 48 hours 
(data S4 and S5). We used very strict criteria to 
identify circadian cycling genes by selecting 
only those genes that were significant in 
three out of three different algorithms (JTK_CYCLE,
ARSER, and RAIN) with a P value <
0.05 and a false discovery rate (FDR) < 5% 
(FDR < 0.05). We found 1718 rhythmic genes 
in young AL mice and 1507 rhythmic genes in 
old AL mice (see Fig. 5A, heatmaps of cycling 
genes, and data S5). The overlap of these 
two gene sets was 694 genes. Therefore, circadian
cycling genes were both lost and gained
with age (Fig. 5A). Figure 5B illustrates six cy- 
cling genes in young and old mice. Of the 
four circadian clock genes, Arntl, Nr1d1, Per1,
and Per2, three had a lower amplitude in old 
mice. Two metabolic pathway genes, Gys2 and 
Pck1, also had lower amplitude with age con- 
sistent with the overall decline in average gene 
expression in metabolic pathways (Fig. 3, B 
and D; see fig. S13 for example circadian pro- 
files of pro-aging and pro-longevity genes). We 
compared the circadian phase and ampli- 
tude (fold change between trough and peak 
on the first and second circadian cycle) for the 
694 shared cycling genes in young and old 
mice (Fig. 5A). There was no change in the 
phases of cycling genes with age; however, the 
amplitude of these cycling genes was lower, 
as seen by the deviation of the regression line 
from unity (slope = 0.5941 + 0.009, P < 0.0001) 
(Fig. 5C). GO analysis of these cycling gene sets 
showed enrichment for metabolism, cell com- 
munication, signaling, and circadian rhythms 
(Fig. 5, D and E, and data S3), which is consistent
with the decline in average gene expression in 
metabolic pathways (Fig. 3D) and with anal- 
ysis of circadian gene targets using chromatin 
immunoprecipitation sequencing, in which 
the GO category metabolism was highly 
significant (32). 
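
The stringent selection described above (a gene counts as cycling only if it is significant in all three algorithms) amounts to intersecting the per-algorithm hit lists. The Python sketch below shows one way to express this. The input layout, the gene names, and all P and q values are hypothetical; it does not reproduce the output format of JTK_CYCLE, ARSER, or RAIN. The last lines use the gene counts reported in the text to tally cycling genes lost and gained with age under AL.

```python
ALPHA = 0.05
ALGORITHMS = ("JTK_CYCLE", "ARSER", "RAIN")


def consensus_cycling_genes(results, alpha=ALPHA):
    """Return genes with P < alpha and q < alpha in every algorithm.

    results: {algorithm: {gene: (p_value, q_value)}}. This container is a
    hypothetical layout chosen for the sketch, not a tool's native output.
    """
    passing_sets = []
    for algorithm in ALGORITHMS:
        per_gene = results[algorithm]
        passing_sets.append(
            {gene for gene, (p, q) in per_gene.items() if p < alpha and q < alpha}
        )
    return set.intersection(*passing_sets)


# Made-up P and q values for three placeholder genes (illustration only).
toy_results = {
    "JTK_CYCLE": {"geneA": (1e-6, 1e-4), "geneB": (1e-5, 1e-3), "geneC": (0.03, 0.20)},
    "ARSER": {"geneA": (1e-5, 1e-3), "geneB": (0.20, 0.60), "geneC": (0.01, 0.04)},
    "RAIN": {"geneA": (1e-7, 1e-5), "geneB": (1e-4, 1e-2), "geneC": (0.02, 0.04)},
}
consensus = sorted(consensus_cycling_genes(toy_results))
print("cycling in all three algorithms:", consensus)

# Counts reported in the text for AL mice.
young_al, old_al, shared = 1718, 1507, 694
lost_with_age = young_al - shared     # cycling in young AL only
gained_with_age = old_al - shared     # cycling in old AL only
print(lost_with_age, "genes lost rhythmicity;", gained_with_age, "gained it")
```

Requiring agreement among all three methods trades sensitivity for specificity: a gene that fails any one method (geneB and geneC in the toy input) is excluded.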


Effects of day versus night CR on circadian 
gene expression 


Because the median life span of mice in the 
CR-night-2h and CR-day-2h groups was sig- 
nificantly different (1068 versus 959 days, re- 
spectively, and log-rank Mantel-Cox, P < 0.05), 
we focused on differential gene expression in 
these two CR groups. One essential difference 
in these two groups is the phase of food con- 
sumption relative to the LD cycle and relative 
to the circadian phase of the mice as assessed 
by their locomotor activity rhythms. We se- 
lected the sum total genes that showed circa- 
dian cycling in at least one of these four groups: 
CR-day-2h and CR-night-2h from young and 
old mice, which led to a total of 1491 cycling 
genes. Figure 6A shows cycling genes in these
four groups as a heatmap in which each line 
represents the color-coded levels of expression 
of one gene (in each row) across time points 
(columns). There was a slight reduction in 
cycling genes in young CR-day-2h mice, but in 
old CR-day-2h mice, there were only seven cy- 
cling genes. Circadian gene expression profiles 
of the six genes shown in Fig. 5B showed that
the day-fed groups at 19 months of age have
either opposite phases (Arntl, Nr1d1, Gys2,
Per2, and Pck1) or disrupted circadian profiles
(Per1) (see Fig. 6B and fig. S13 for example
circadian profiles of pro-aging and pro-longevity
genes). We compared the fold-change am- 
plitude of the 1491 genes in young and old 
mice. At both 6 and 19 months of age, the CR- 
night-2h groups had a significantly higher 
amplitude compared with the CR-day-2h 
groups, as seen in the fold-change frequency 
histograms and correlation plots (Fig. 6C). 
Thus, CR-night-2h feeding enhanced circadian 
amplitude relative to that of CR-day-2h. 
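
Amplitude is defined earlier in the text as the fold change between trough and peak on the first and second circadian cycle. A minimal Python sketch of that calculation for a 12-point time series (sampled every 4 hours for 48 hours) is given below. The expression profiles are made up for illustration, and how the two per-cycle values were combined in the study is not stated in this section, so the sketch simply reports both.

```python
def cycle_fold_changes(expression, points_per_cycle=6):
    """Return the peak/trough fold change for each consecutive cycle.

    expression: positive expression values sampled at equal intervals;
    6 points per cycle corresponds to sampling every 4 h over 24 h.
    """
    if not expression or len(expression) % points_per_cycle != 0:
        raise ValueError("series length must be a multiple of points_per_cycle")
    fold_changes = []
    for start in range(0, len(expression), points_per_cycle):
        cycle = expression[start:start + points_per_cycle]
        trough = min(cycle)
        if trough <= 0:
            raise ValueError("expression values must be positive")
        fold_changes.append(max(cycle) / trough)
    return fold_changes


# Made-up expression profiles (arbitrary units; not data from the study).
high_amplitude = [10, 40, 80, 40, 10, 5, 10, 40, 80, 40, 10, 5]
low_amplitude = [30, 40, 45, 40, 30, 25, 30, 40, 45, 40, 30, 25]

high_fc = cycle_fold_changes(high_amplitude)
low_fc = cycle_fold_changes(low_amplitude)
print("high-amplitude gene:", high_fc)
print("low-amplitude gene:", low_fc)
```

In this toy example the first gene has a 16-fold peak-to-trough change in each cycle and the second a 1.8-fold change, the kind of difference summarized by the fold-change histograms in Fig. 6C.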

To assess the phase of entrainment in these 
four CR groups, we compared the phases of 
cycling genes relative to those of AL mice at 
6 and 19 months of age. Figure 6D shows cor- 
relation plots of genes that overlapped with 
AL mice in either the CR-day-2h or CR-night- 
2h groups at 6 and 19 months of age. The CR- 
night-2h mice shared many more cycling genes 
with AL mice than did the CR-day-2h mice at 
both ages. In addition, the phases of gene ex- 
pression of CR-night-2h mice of both ages were 
correlated with the phases of the genes from 
the respective AL age group, with the data 
points falling on the unity line. As expected, 
the CR-day-2h mice had opposite or diver- 
gent phases relative to those of AL mice. For 
the old CR-day-2h mice, only seven genes were 
scored as cycling and only five overlapped with 
those of AL mice. Thus, in CR-day-2h mice, 
there was a paucity of cycling genes and this 
declined precipitously with age. A similar 
reduction was seen in fold-change gene ex- 
pression in CR-spread mice compared with 
CR-night-2h mice at 6 and 19 months of age 
(fig. S14). The CR-night-2h groups showed ro- 
bust circadian cycling, even in the old mice, 
suggesting that this intervention is very effec- 
tive in rescuing circadian cycling of gene ex- 
pression relative to CR-day-2h. A proviso to 
this cycling analysis is that these cycling algo- 
rithms, JTK_CYCLE and ARSER, are biased 
toward sinusoidal waveforms, and although 
RAIN can search for spiky or sawtooth wave- 
forms, our requirement that a cycling gene is 
significant (FDR < 0.05) in all three algo- 
rithms results in only genes with sinusoidal 
waveforms passing this threshold. Daytime 
feeding and 2-hour feeding bouts both modified 
the gene expression time series waveforms 
so that they were less sinusoidal, and this 
contributed to the lower number of genes 
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Fig. 5. Circadian rhythms in liver gene expression are blunted during aging 
in AL mice. (A) Gene expression patterns from mRNA-seq were analyzed for 
circadian rhythms using the ARSER, JTK_CYCLE (from Metacycle R Package), 
and RAIN circadian algorithms. Heatmaps (top) are sorted by phase of gene 
expression. Each row is one gene with expression level in z-score at 12 time 
points (columns). Venn diagram (bottom) shows the number of rhythmic genes 
in young (gray) and aged (red) AL livers using stringent criteria (significantly 
cycling according to three algorithms; Benjamini-Hochberg P and q < 0.05 and 
log2FC > 0.3) to define rhythmicity. (B) Examples of circadian profiles of genes 
that are rhythmic in both young and aged AL livers. Black indicates young 
AL livers, and red indicates aged AL livers. (C) Comparison of phase (left, hours) 
and amplitude (right, daily fold change) of the 694 genes that were rhythmic 
in both age groups. The red correlation line (Spearman) and linear regression 
(slope is statistically different from 1; P < 2 x 107) in the fold change 
comparison indicate that aged animals showed overall reduced amplitude 
of rhythmic genes. (D and E) GO terms of genes that are cycling in either 
young (D) or aged (E) AL mice. Represented are 10 nonredundant terms of the 
top 25 most significant enriched terms. 


identified as cycling in the CR-day-2h group. 
Therefore, we do not recommend putting too 
much weight on the number of cycling genes 
called but rather emphasize the phase and 
amplitude of gene expression. 
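
The consensus rule described above (a gene counts as cycling only if it passes FDR < 0.05 in all three algorithms) can be illustrated with a minimal sketch. The FDR values below are invented for illustration and are not data from this study; the function name `consensus_cycling` is hypothetical.

```python
# Made-up per-algorithm FDR values for three genes (illustrative only).
fdr = {
    "JTK_CYCLE": {"Arntl": 0.001, "Per2": 0.010, "Gys2": 0.200},
    "ARSER":     {"Arntl": 0.002, "Per2": 0.030, "Gys2": 0.010},
    "RAIN":      {"Arntl": 0.004, "Per2": 0.020, "Gys2": 0.001},
}

def consensus_cycling(fdr_by_algorithm, threshold=0.05):
    """Return genes whose FDR is below `threshold` in every algorithm."""
    per_algorithm = [
        {gene for gene, q in table.items() if q < threshold}
        for table in fdr_by_algorithm.values()
    ]
    # A gene survives only if it appears in every algorithm's set.
    return set.intersection(*per_algorithm)

cycling = consensus_cycling(fdr)
print(sorted(cycling))
```

In this toy case a gene that passes two algorithms but fails the third is dropped, which is why the intersection is conservative and why the count of cycling genes is sensitive to waveform shape.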


Discussion 


Classic CR protocols not only reduce energy 
intake but also lead to severe time-restricted 
feeding behavior and prolonged 
fasting intervals (7-9). Because time-restricted 
feeding and fasting are beneficial to health, 
these two factors may contribute to life-span 
extension in classic CR experiments. In the 
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work presented here, we deconvolved the 
effects of calories, fasting, and circadian align- 
ment on longevity. We compared five different 
CR feeding groups that differed only in the 
daily pattern of food consumption without 
any changes in food composition or energy 
content. We used an automated feeding sys- 
tem (7) to test whether the timing and fasting 
period between meals affected life span in male 
mice under 30% CR by allowing food access 
only during the day, at night, or evenly dis- 
tributed throughout 24 hours. By spreading 
out food through 24 hours, with no day-night 
feeding pattern in the CR-spread group, we 
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found only an ~10% extension of median life 
span compared with AL-fed mice. The CR-day 
fed groups that had an ~12- or ~22-hour fast- 
ing interval lived ~20% longer than AL control 
mice. The degree of life-span extension was 
significantly greater when food was consumed 
during the nighttime, which is the normal feeding 
time in nocturnal rodents (~35% versus 
20% compared with AL, log-rank Mantel-Cox 
P < 0.0001, and ~10% night versus day, log-rank 
Mantel-Cox P < 0.05) (Fig. 2A). Although 
potential sleep disruption needs to be carefully 
studied in CR-day versus CR-night groups, re- 
cent evidence shows that sleep homeostasis is 
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Fig. 6. Effects of CR and phase of feeding on circadian gene expression. 


(A) Gene expression patterns from mRNA-seq were analyzed for circadian 
rhythms using the ARSER, JTK_CYCLE (from the Metacycle R Package), and 
RAIN circadian algorithms. Heatmaps are sorted by phase of gene expression. 
Each row is one gene with expression level in z-score shown at 12 time points 
(columns). (B) Examples of the circadian profiles of the same genes shown 
in Fig. 5B comparing profiles from CR-night-2h (blue) to CR-day-2h (yellow) 
aged mice. (C) Comparison of circadian amplitude (daily fold-change) of 
1491 rhythmic genes from young (left) and aged (right) CR groups. Top panels 
show amplitude density plots (median amplitude values are inset). Bottom 
panels are correlation plots comparing amplitude of genes from CR-night-2h 
to CR-day-2h in young (left) and aged (right) mice. The linear regression 
lines have slopes that are significantly <1 (P < 2 x 10°). (D) Phase 
correlation plots of rhythmic genes from young (left) and old (right) 
CR-night-2h-fed (blue) and CR-day-2h-fed (yellow) mice versus AL-fed mice 
of the same ages. Phase is represented in hours. Numbers of shared 
cycling genes between each CR condition and AL are labeled on top of 
the correlation plot. 


maintained with daytime feeding (29). All CR 
groups maintained a relatively steady body 
weight throughout their life span, indicating 
that the additive effects of CR combined with 
appropriate timing of food consumption were 
independent of weight gain. Thus, the maximal 
pro-longevity benefits of CR can be achieved 
by a fasting interval of >12 hours in which a 
time-restricted feeding interval occurs in phase 
with the natural nocturnal circadian phase 
of feeding in mice (i.e., circadian alignment). 

These results are consistent with recent 
studies in C57BL/6J male mice in which similar 
life-span extensions were reported in 
once-per-day CR mice fed in the morning 
(20%, equivalent to our CR-day-2h group) (9) 
or mice on ~28% CR when food was consumed 
during the daytime but closer to the normal 
active phase at night (9 hours later than our 
CR-day-2h and 3 hours earlier than our 
CR-night-2h groups) (8). Pak et al. (9) suggested that 
fasting alone drives the geroprotective effects 
of CR, which is partially consistent with our 
results; however, they argued, using a 50% 
cellulose-diluted diet, that 30% CR does not 
extend life span, and they did not test whether the 
phase or circadian alignment of CR could be a 
factor. By contrast, we showed that 30% CR 
without dilution of the diet in the CR-spread 
group extended life span by ~10%; thus, we 
conclude that 30% CR alone without fasting 
or circadian alignment accounts for a 10% 
extension of life span. In our study, the diets 
used were identical in all six groups, which 
mitigates the confounding effects of 
diet composition such as fiber (cellulose) and 
uncontrolled grain-based diets (23, 49, 50). 
Therefore, our conclusions differ from Pak et al. 
(9) in two ways: (i) we found that CR alone 
without fasting can still extend life span by 
~10% using the same diet and (ii) we found 
that fasting contributes in an additive manner 
to life-span extension rather than driving the 
effects of CR. 

In nonhuman primates, the effects of CR on 
longevity have differed between studies performed 
at the University of Wisconsin (UW) 
and the National Institute on Aging (NIA). 
However, many differences in study design, 
diet, and feeding protocols have been documented, 
and the joint consensus is that CR improves health 
and survival in rhesus monkeys (51-53). CR 
did not extend life span in the NIA study; 
however, the NIA AL control group was longer 
lived and slightly calorie restricted, was fed 
twice a day, and had lower body weights 
than the AL controls in the UW study. In 
addition, in the UW study, food was pro- 
vided AL during the day from ~8:00 a.m. to 
~4:00 p.m., but was removed during the night. 
Thus, in both studies, the AL groups were 
either partially calorie restricted (NIA) or 
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were under a time-restricted feeding proto- 
col (UW), which should be beneficial. Thus, it 
would be of considerable interest to explore 
the effects of time-restricted feeding and cir- 
cadian phase of CR in future studies in non- 
human primates. 

The benefits of circadian alignment of feeding 
and fasting cycles are widespread across 
species. Such alignment increases life span in flies (54), 
protects against metabolic disorders in mice 
(7, 18, 55, 56), and promotes health in humans 
(57, 58). We also found that CR promotes 
widespread metabolic benefits, which include 
lower body weight with reduced fat content. 
Furthermore, CR (particularly the longest- 
lived group, CR-night) attenuated age-related 
changes observed under AL by improving glu- 
cose homeostasis, insulin sensitivity, and hor- 
monal profiles. 

Using circadian gene profiling, we found 
that timing of food intake led to complex 
genome-wide reprogramming of circadian 
gene expression in the liver, which is con- 
sistent with other findings (11, 14, 33, 59-61). 
This emphasizes the importance of consid- 
ering the sampling time, which is often a 
snapshot, before concluding whether any in- 
tervention increases or decreases the expres- 
sion of a gene of interest. If a gene has a 
circadian oscillation, then a single snapshot 
could lead to the opposite conclusions depend- 
ing on the time it was taken, particularly in 
classic CR protocols resembling our CR-day-2h 
group (1, 7). 
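
The snapshot problem can be made concrete with a minimal sketch, assuming an idealized 24-hour cosine rhythm. The gene, the expression levels, and the 12-hour phase shift are all hypothetical and chosen only to show how the conclusion flips with sampling time.

```python
import math

def expression(t_hours, peak_hour, mesor=10.0, amplitude=5.0):
    """Idealized 24-hour cosine rhythm peaking at `peak_hour` (arbitrary units)."""
    return mesor + amplitude * math.cos(2 * math.pi * (t_hours - peak_hour) / 24)

# Hypothetical gene with the same mean level and amplitude in both groups;
# only the phase differs (peak shifted by 12 hours in group B).
def group_a(t):
    return expression(t, peak_hour=6)

def group_b(t):
    return expression(t, peak_hour=18)

for t in (6, 18):
    a, b = group_a(t), group_b(t)
    verdict = "higher in B" if b > a else "lower in B"
    print(f"sampled at hour {t}: A={a:.1f}, B={b:.1f} -> {verdict}")

# Averaged over the whole day the two groups do not differ.
daily_mean_a = sum(group_a(t) for t in range(24)) / 24
daily_mean_b = sum(group_b(t) for t in range(24)) / 24
```

A single time point suggests the gene is either down- or up-regulated in group B depending on when the sample was taken, even though the daily mean is identical by construction.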

Aging was associated with increased expres- 
sion of the genes involved in immune processes 
and inflammation and decreased expression of 
the genes involved in metabolism and circadian 
biology. CR treatment restored or attenuated 
many of these age-related changes in gene 
expression in a manner similar to results 
reported by the de Cabo (24, 34) and Sassone- 
Corsi (4) groups. Decreases in circadian ampli- 
tude of gene expression with age were reduced 
by CR. Overall, feeding time and fasting had 
additive effects on CR-mediated life-span ex- 
tension. Together, CR and time restriction of 
feeding to the nighttime optimally extended 
life span and delayed many of the age-related 
gene expression changes in immune function, 
inflammation, and metabolism. Further studies 
are required to determine whether disruption 
of circadian sleep/wake cycles by daytime feed- 
ing also contributes to the day versus night 
differential responses in the liver transcriptome 
and life span. Thus, circadian interventions 
such as timed feeding can enhance the well- 
known life-span benefits of CR. 

There are many links between the circa- 
dian system and aging (20). High-amplitude 
circadian rhythms correlate with well-being 
(13, 62, 63), whereas clock dysfunction leads 
to metabolic disorders, premature aging, and 
reduced life span (42, 64-67). During normal 
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aging, rhythms decrease in amplitude and 
often exhibit a shift in phase (14, 20, 68-72). 
In both flies and mice, the effects of dietary 
restriction on life span require core circadian 
clock genes (73-75). The circadian system reg- 
ulates most of the metabolic pathways impli- 
cated in longevity (3, 4, 13, 22, 32, 59, 76, 77). 
Molecules known to function in the regula- 
tion of life span by dietary restriction, such as 
insulin and insulin-like growth factor 1 (IGF-1), 
Sirtuin 1 (SIRT1), nicotinamide phosphoribosyltransferase 
(NAMPT), AMP-activated protein 
kinase (AMPK), PPARγ coactivator 1α (PGC-1α), 
mechanistic target of rapamycin (mTOR), and 
glycogen synthase kinase 3β (GSK3β), are all 
intricately involved in the molecular mechanisms 
of circadian clocks (77-89). The master 
circadian transcription factors CLOCK and 
BMAL1 have direct gene targets in every fundamental 
metabolic pathway in the liver 
(25, 32, 90, 91). Because of these direct links 
among the pathways involved in aging and 
longevity, metabolism, and the circadian clock, 
our results demonstrate the importance of 
timing of CR and indicate that optimizing the 
phase of circadian gene expression may be a 
powerful intervention for extending life span. 

We used C57BL/6J male mice in this study. 
However, there could be strain- and sex-specific 
responses worth studying further (24, 92), be- 
cause, for example, ovarian hormones can 
protect females against dietary challenges 
that otherwise disrupt circadian rhythms in 
males (55). In future work, both sexes and 
multiple genetic backgrounds (92-94) should 
be used to explore the broader effects of cir- 
cadian interventions on aging, and the results 
may support the application of circadian-timed 
interventions in human studies. 
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DNA viruses are increasingly recognized as influencing marine microbes and microbe-mediated 
biogeochemical cycling. However, little is known about global marine RNA virus diversity, ecology, 
and ecosystem roles. In this study, we uncover patterns and predictors of marine RNA virus 
community- and “species”-level diversity and contextualize their ecological impacts from pole to 
pole. Our analyses revealed four ecological zones, latitudinal and depth diversity patterns, and 
environmental correlates for RNA viruses. Our findings only partially parallel those of cosampled 
plankton and show unexpectedly high polar ecological interactions. The influence of RNA viruses on 
ecosystems appears to be large, as predicted hosts are ecologically important. Moreover, the 
occurrence of auxiliary metabolic genes indicates that RNA viruses cause reprogramming of diverse 
host metabolisms, including photosynthesis and carbon cycling, and that RNA virus abundances 


predict ocean carbon export. 


The Global Ocean is dominated by plankton 
communities that are essential to 
sustain life on Earth. Plankton are at the 
base of the food web for marine and terrestrial 
organisms and drive planetary 
biogeochemical cycles (1, 2). Because nearly 
half of Earth’s primary production derives from 
ocean plankton, carbon cycling and biodiversity 
studies have long been a focus in oceanography 
(3). In addition, marine plankton are central 
to the biological carbon pump because their 
activity determines whether dissolved carbon 
dioxide is assimilated into biomass that can 
be sequestered to the deep ocean or recycled 
in surface waters and likely released to the 
atmosphere (4, 5). Thus, understanding ocean 
biodiversity, carbon export, and related chem- 
ical transformations is critical to predict- 
ing the changing role of the ocean in the 
Anthropocene. 

Plankton are susceptible to virus infection. 
Double-stranded DNA (dsDNA) viruses have 
been increasingly recognized as major eco- 
system players (6), whereas RNA viruses have 
been less well-studied owing to methodological 
challenges (7). It is clear, however, that marine 
RNA viruses are likely important in marine 
ecosystems, as they (i) are abundant (8, 9), 
(ii) infect protists and invertebrates that are 
central to ocean biogeochemical cycling (10), 
and (iii) have been statistically associated 
with termination of algal blooms (11, 12) and 
modulation of host diversity (13). Despite literature 
increasingly presenting RNA viruses 
as a likely major force behind biogeochemistry 
(6, 14, 15), empirical data are challenging to 
obtain. Recent sequencing surveys, including 
from the oceans, have identified thousands of 
previously unknown RNA viruses that constitute 
genus- or subfamily-rank taxa (16-18) as well as 
phylum-rank taxa (9). However, research on the 
ecology of RNA viruses has been limited to small 
spatial scales among pelagic waters and/or 
viruses associated with larger plankton of a few 
species (table S1). This lack of ecological context, 
particularly over large scales, limits the incor- 
poration of RNA viruses into predictive models. 

Previously, we analyzed 771 metatranscrip- 
tomes (provided by Tara Oceans Expeditions) 
that span diverse ocean waters, depths, orga- 
nismal size fractions, and sequencing library 
approaches (Fig. 1A, fig. S1, table S2 for sample 
metadata, and materials and methods) to 
identify and quantify RNA viruses (19). This 
effort led to the identification of 44,779 RNA 
virus contigs that were dereplicated to 5504 
“species”-level virus operational taxonomic 
units (vOTUs), for which we established tax- 
onomy, evolutionary origins, and biogeography. 
In this work, we leverage these data to gen- 
erate and test several existing hypotheses about 
RNA virus diversity and their ecological roles 
throughout the Global Ocean. 


RNA virus meta-community analyses reveal 
distinct ecological zones 


Given the importance of marine plankton (2), 
scientists have long sought to understand 
their ecological patterns and drivers through 
space and/or time. Temporal studies have 
revealed seasonal-, depth-, and nutrient-related 
local or regional drivers of plankton species 
diversity and community composition, whereas 
systematic surveys sought to examine these 
ecological patterns and drivers on a global 
scale (table S3). However, none of these global 
studies included RNA viruses. Hence, we 
used our previously generated RNA vOTUs 
(19), preclustered at 90% average nucleotide 
identity across 80% of the shorter sequence 
length and 1-kb minimum contig length 
(materials and methods), and their relative 
abundances, estimated by means of meta- 
transcriptomic read mapping (materials and 
methods), to investigate marine RNA virus 
ecology globally. 
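
The vOTU thresholds quoted above (90% average nucleotide identity across 80% of the shorter sequence, 1-kb minimum contig length) can be read as a simple pairwise rule. The sketch below is only an illustration of that rule: the function name and inputs are hypothetical, and the actual clustering procedure is the one described in the materials and methods, which is not reproduced here.

```python
def same_votu(len_a, len_b, aligned_len, identity,
              min_identity=0.90, min_coverage=0.80, min_contig_len=1000):
    """Decide whether two contigs would fall into one vOTU under the stated thresholds.

    len_a, len_b: contig lengths in bases.
    aligned_len:  aligned bases between the two contigs (assumed aligner output).
    identity:     average nucleotide identity over the aligned region, 0..1.
    """
    shorter = min(len_a, len_b)
    if shorter < min_contig_len:
        return False  # contigs under 1 kb are not considered
    coverage = aligned_len / shorter  # fraction of the shorter sequence covered
    return identity >= min_identity and coverage >= min_coverage

print(same_votu(5000, 1500, 1300, 0.93))  # covers ~87% of the shorter contig
print(same_votu(5000, 1500, 1000, 0.95))  # covers ~67% of the shorter contig
```

Measuring coverage against the shorter sequence lets a fragmentary contig join the vOTU of a longer, more complete genome.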

By using a statistical method that non- 
linearly deconvolutes high-dimensional data 
into two-dimensional space (Fig. 1B; t-distributed 
stochastic neighbor embedding, fig. S2, A to 
C) and classical hierarchical clustering tech- 
niques (fig. S2D) on Bray-Curtis dissimilarity 
matrices of RNA vOTU relative abundances 
(materials and methods), we show that Global 
Ocean RNA virus communities can be assigned 
to four ecological zones: Arctic, Antarctic, Tem- 
perate and Tropical Epipelagic, and Temperate 
and Tropical Mesopelagic. This classification 
into only four ecological zones contrasts with 
the 56 biogeochemical provinces that are clas- 
sically described for the surface oceans, where 
nutrients and primary productivity drive plank- 
ton community composition (20). However, the 
four ecological zone assignments are nearly 
identical (115 of 118 shared samples) to those 
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Fig. 1. The cross-domain Global Ocean plankton sampling and resultant 
RNA virus meta-communities identified from the metatranscriptomes. 
(A) Global Ocean sampling map shows the cruise of the Tara Oceans and Tara 
Oceans Polar Circle expeditions and the location of their stations, which are 
shown with green and white shapes, respectively. Down-pointing triangles 
indicate stations from where dsDNA viromes were previously collected. Up- 
pointing triangles, squares, and circles show stations with samples of 
prokaryote-enriched size fractions, eukaryote-enriched size fractions, and both, 
respectively. The upper blowout panel shows a graded arrow that represents a 
logarithmic scale of the plankton organismal size fractions captured in this 
study. The four operational size fractions (piconanoplankton, nanoplankton, 
microplankton, and mesoplankton) are indicated by the top colored bars and 
are classified as “prokaryote-enriched” or “eukaryote-enriched” size fractions 
(highlighted by the bottom gradient-colored bars). Such categories, despite 
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being enriched in a type of organism, do not exclude other types. Thus, 
prokaryote-enriched samples could contain giant viruses and picoeukaryotes, 
and eukaryotic holobionts of eukaryote-enriched samples could harbor 
prokaryotes or viruses either as symbionts or food. A picture of the research 
vessel Tara is included as well. (B) Statistical analysis [t-distributed stochastic 
neighbor embedding (t-SNE)] of a Bray-Curtis dissimilarity matrix that was 
calculated from all RNA virus sequence samples in this study regardless of size 
fraction or library preparation method. Dot colors follow the legend shown in 
(C) (also see figs. S4 and S5 for vOTU definition sensitivity analyses). 
(C) Regression analysis of the first coordinate of a principal coordinate analysis 
(PCo1) of the same Bray-Curtis dissimilarity matrix in (A) (also see fig. S2) and 
temperature, which shows that samples across all the size fractions were 
separated by their local temperatures with an r of 0.74 (P values = 0). ANT, 
Antarctic; ARC, Arctic; TT, Temperate and Tropical. 
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Fig. 2. RNA and DNA virus “species”-level diversity 
show large-scale congruence. (A and B) Boxplot (A) 
and regression (B) analyses of RNA and DNA virus 
“species”-level diversity across their shared ecological 
zones. Shannon's H values were mean-centered and 
rescaled across the two virus nucleic acid types for 
visual comparisons. All boxplots show medians 
and quartiles. The medians of each boxplot were used 
for direct regression analysis. Statistical support 
(Tukey honest significant differences method on an 
analysis of variance) is indicated in the figure as 
follows: *adjusted P < 0.05, **adjusted P < 0.01, and 
****adjusted P < 0.000001. Only RNA viruses from 
the prokaryotic fraction were used (see fig. S3 for 
comparison with the eukaryotic fractions) as this 
fraction showed the smallest library preparation biases 
(fig. S1 and materials and methods). ANT, Antarctic; 
ARC-H, Arctic high diversity; ARC-L, Arctic low 
diversity; TT_EPI, Temperate and Tropical Epipelagic; 
TT_MES, Temperate and Tropical Mesopelagic. 


that were inferred for prokaryotic dsDNA 
viruses (materials and methods; the fifth 
Bathypelagic zone that was inferred from 
dsDNA virus analyses was not sampled here) 
(21) and largely parallel to those from broader 
Tara Oceans Consortium work on prokaryotes 
(22). Before this study, these ecological zone 
analyses had not been performed for eukaryotes 
or eukaryotic RNA viruses. Also previously, 
transport or migration of eukaryotic plank- 
ton across ocean surface biomes and layers 
was thought to erode the boundaries between 
these ecological zones (23). Our and other 
recent eukaryotic data (24) challenge this 
hypothesis. 
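
For readers unfamiliar with the Bray-Curtis dissimilarity that underlies the ecological zone assignments, the following is a minimal sketch of the standard formula applied to invented abundance vectors. The sample names and counts are hypothetical; the embedding and hierarchical clustering steps used in the study are not shown.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0

# Invented vOTU abundance profiles (four vOTUs per sample).
samples = {
    "polar_1":    [8, 2, 0, 0],
    "polar_2":    [7, 3, 0, 0],
    "tropical_1": [0, 1, 6, 3],
}

names = sorted(samples)
matrix = {(a, b): bray_curtis(samples[a], samples[b]) for a in names for b in names}

for a in names:
    print(a, [round(matrix[(a, b)], 2) for b in names])
```

Samples with similar community composition have values near 0 and samples sharing few vOTUs have values near 1, so a clustering method applied to this matrix groups the two toy polar samples together.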

Investigation of ecological parameters that 
potentially drive community structure at large 
scale revealed that temperature alone could 
explain most RNA virus community compo- 
sition variation along the first ordination axis 
(Fig. 1C). Other ecological drivers, including 
oxygen, depth, and nutrient availability, may 
shape plankton community composition (table 
S3, A10 to A14), but these often co-vary with 
temperature. Limited sampling in these previ- 
ous, geographically constrained studies led to 
the hypothesis that depth is the main driver of 
plankton community composition. With global 
data now available, it is apparent that temper- 
ature variance potentially drives stratification 
in nonpolar regions (fig. S2, E and F) and 
selects for cold-adapted communities in polar 
regions. A temperature-driven RNA virus com- 
munity composition complements that for 
dsDNA viruses (21), prokaryotes (22), eukary- 
otes (24), and their interactions (25). 


Differential predictors of RNA virus global and 
local “species”-level diversity 


Comparison of the diversity patterns of RNA 
(this study) and dsDNA (21) viruses revealed 
highly concordant large-scale patterns, includ- 
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ing previously identified (21) high- and low- 
diversity regions of the Arctic Ocean (ARC-H 
and ARC-L) (Fig. 2). However, local diversity 
comparisons (i.e., per-sample comparisons) 
showed that the concordance, despite being 
significant (P < 0.02), was modest (r = 0.25 by 
both Pearson’s and Spearman’s tests), which 
suggests that small-scale diversity drivers may 
differ for DNA and RNA viruses. When exam- 
ining the large suite of environmental varia- 
bles available for our samples (table S4) for 
possible correlations with RNA and dsDNA 
virus diversity, we accounted for collinearity 
using a systems biology network analysis frame- 
work to reduce environmental factor dimen- 
sionality into fewer environmental “modules” 
(Fig. 3 and materials and methods). 

We found, first, that similar to dsDNA viruses 
(21), temperature (cyan module in Fig. 3) was 
not the best predictor of RNA virus diversity. 
Instead, nutrients (white module in Fig. 3) were 
prominent predictors of species diversity for 
both dsDNA and RNA viruses, along with 
other signatures of primary productivity (violet 
module in Fig. 3). Second, in our previous 
study on dsDNA viruses (21), we showed that 
the link between dsDNA virus diversity and 
nutrients might be through primary produc- 
tivity, because photosynthetic coccolithophores’ 
abundance and particulate inorganic carbon 
(PIC) concentration covary with dsDNA virus 
diversity (light green module in Fig. 3). More 
recently, the relationship between dsDNA 
viruses and PIC has been posited to be abiotic 
on the basis of direct virus-mediated mineral 
precipitation (26). Unlike dsDNA virus diver- 
sity, RNA virus diversity does not correlate 
with the PIC module but does still correlate 
with primary productivity pigment concen- 
trations such as chlorophyll b (yellow module 
in Fig. 3), which indicates, as expected, that 


dsDNA and RNA viruses infect different hosts. 
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This and other biological features of RNA vi- 
ruses, such as their shorter and faster-evolving 
genomes, higher burst sizes, lytic lifestyles, and 
eukaryotic hosts, are hypothesized to drive virus- 
host interaction and ecosystem impact differ- 
ences from dsDNA viruses (27). Models that 
are based on known RNA virus biological fea- 
tures also lend support to this idea (6, 7, 27, 28). 
We interpret the small-scale differences in di- 
versity patterns, despite high concordance at 
the large scale, as also deriving from varied 
biological features across RNA and dsDNA 
viruses. 

Together, these findings indicate that the 
underlying large-scale potential drivers for 
virus community composition (which encom- 
passes the identity and abundance of vOTUs) 
and species diversity (which encompasses the 
vOTUs’ richness and distribution evenness) 
act similarly for the RNA viruses of eukaryotes 
and the dsDNA viruses of prokaryotes. For 
virus community composition, perhaps this is 
not surprising, given that likely host commu- 
nity compositions (planktonic prokaryotes and 
microbial eukaryotes) also appear to be mainly 
driven by temperature (22, 24, 29). For virus 
diversity, the relationship with host diversity 
can be more complex (see “RNA virus ‘species’- 
level diversity along ecological gradients”). Lo- 
cally, the varying biological features of RNA 
viruses are hypothesized (7, 28) to drive virus- 
host interaction and ecosystem impact differ- 
ences between largely prokaryotic dsDNA 
viruses and eukaryotic RNA viruses. For local 
diversity predictors, our findings are consistent 
with this hypothesis. 
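
The distinction drawn above between richness and evenness can be illustrated with Shannon's H, the diversity index used in Fig. 2. This is a minimal sketch on invented abundance counts; the use of Pielou's evenness (H divided by the log of richness) is an assumption added here for illustration and is not stated in the text.

```python
import math

def shannon_h(abundances):
    """Shannon's H (natural log) from raw abundance counts."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)

def pielou_evenness(abundances):
    """H divided by its maximum for the observed richness (1.0 = perfectly even)."""
    richness = sum(1 for a in abundances if a > 0)
    return shannon_h(abundances) / math.log(richness) if richness > 1 else 0.0

even_community = [5, 5, 5, 5]     # four vOTUs, equal abundance
skewed_community = [17, 1, 1, 1]  # same richness, one dominant vOTU

for name, community in (("even", even_community), ("skewed", skewed_community)):
    print(name, round(shannon_h(community), 3), round(pielou_evenness(community), 3))
```

Both toy communities have the same richness, yet the skewed one has a lower H, which is why diversity can differ between samples that contain the same number of vOTUs.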


RNA virus “species”-level diversity along 
ecological gradients 


The physicochemical tolerances, or ecological 
gradients, of RNA viruses are not understood. 
Organismal diversity typically decreases with 




[Fig. 3 graphic: network of environmental variables (including temperature, depth, oxygen, salinity, nutrients, pigments, PIC, and POC) grouped into WGCNA-supported modules. The legend indicates Pearson's r and P value classes for correlations with Shannon diversity of DNA and RNA viruses, and WGCNA intramodular correlations.]

Fig. 3. “Species”-level diversity correlates of marine RNA viruses. Weighted 
gene correlation network analysis (WGCNA)-supported modules (to account for 
... were statistically supported by both Pearson's and Spearman's tests. Only RNA 
viruses from the prokaryotic fraction were used (see Fig. 2 for explanation). 
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indicated by organism silhouettes in each section. Inferred plants were 
interpreted as their closest relatives, chlorophytes (green algae), in the 
marine environment. Bacteria were inferred from picobirnavirids. Annotated 
proteins associated with multiple, disparate cellular processes or whose 
function remains obscure are not shown (see annotation details for 
corresponding vOTUs and virus contigs in table S6). ABC, ATP-binding 
cassette; TRAP, tripartite ATP-independent periplasmic. 


preserved in cold temperatures and/or (ii) more 
viruses of distinct species to interact with the 
same host organism in polar waters. The former 
hypothesis has some support in literature (36), 
whereas the latter is untested. 

To test the latter hypothesis, we built an 
abundance-based co-occurrence network that 
integrated RNA viruses, prokaryotes, and eu- 
karyotes (materials and methods) to predict 
hosts for these RNA viruses [sensu (25)]. As- 
suming that the overall topology of the network 
is relatively representative, even if specific pre- 
dictions are not accurate (see the predicted 
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hosts section below), we compared the average 
number of connections per taxon (i.e., mean 
degree) in polar and nonpolar samples. This 
comparison showed significantly more con- 
nections in polar samples than nonpolar sam- 
ples, and this feature was solely driven by RNA 
viruses (Fig. 4C). This result was unexpected 
but is in line with a recent ecological network 
theory prediction that used data from 511 
mammal-infecting viruses to show a nonlinear 
relationship between host and virus diversity 
(37), which was interpreted to be a result of 
host sharing among different sets of viruses of 
separate species. 
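
The mean degree comparison described above amounts to counting connections per taxon in each network. Below is a minimal sketch on two invented edge lists; the taxon names are hypothetical, and taxa without any connection are not counted because they do not appear in an edge list.

```python
from collections import defaultdict

def mean_degree(edges):
    """Average number of connections per taxon in an undirected network."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sum(degree.values()) / len(degree) if degree else 0.0

# Toy networks: several viruses share one host in the "polar" example.
polar_edges = [
    ("virus1", "hostA"), ("virus2", "hostA"),
    ("virus3", "hostA"), ("virus1", "hostB"),
]
nonpolar_edges = [("virus1", "hostA"), ("virus2", "hostB")]

print(mean_degree(polar_edges), mean_degree(nonpolar_edges))
```

In the toy polar network, host sharing among viruses raises the mean degree, which is the pattern the second hypothesis predicts.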

Hence, although the ecological zones and 
potential ecological drivers of marine RNA 
viruses (Fig. 1, B and C) and their expected 
eukaryotic hosts (24) were similar in our data- 
sets, the species diversity relationships of RNA 
viruses and their hosts can be more complex on 
a global scale. 


Marine RNA viruses and inferred local and 
global ecological impact 


First, we sought to place RNA virus diversity 
data into an ecosystem context by assessing 
local- to global-scale impacts by means of in- 
fected plankton hosts or altered metabolisms 
(local scale) versus systems-level ecosystem im- 
pact (global scale). We predicted hosts for our 
vOTUs using three approaches: (i) host infor- 
mation available for viruses of established taxa, 
(ii) abundance-based co-occurrence, and/or 
(iii) endogenous virus element (EVE) signatures 
(fig. S6). Although these results provide 
only broad taxon rank host predictions, as in 
silico host inferences for RNA viruses are not 
well-established, they indicated infection of 
diverse organisms of ecological interest, pre- 
dominantly protists and fungi, and, to a lesser 
extent, invertebrate metazoans (table S5). We 
also explored alternative eukaryotic genetic 
codes for host prediction, which revealed 11 
known alternative, eukaryotic genetic codes 
in 6.8% of the vOTUs and indicated microbial 
eukaryotes (including mitochondria of yeast, 
mold, protozoans, and chlorophyceans and 
nuclear codes of several ciliates) and meta- 
zoans (mitochondria of invertebrates) as puta- 
tive hosts (table S5). Notably, these inferred 
hosts are associated with diverse ecologi- 
cal functions, including phototrophy (e.g., 
bacillariophytes), phagotrophy (e.g., ciliates), 
mixotrophy (e.g., dinophyceans), saprotrophy
(e.g., ascomycetes), parasitism (e.g., alveolates), 
grazing (e.g., arthropods), and filter feeding 
(e.g., annelids). Furthermore, several of these 
hosts, including certain invertebrate metazoans 
and particularly protists and fungi, are also 
recognized as critical contributors to the bio- 
logical carbon pump. Although host prediction 
is challenging, these findings add support to 
prior work at smaller scales (table S1) that indicates
that RNA viruses are central ecological
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players in the oceans. These findings also indi- 
cate that, although prokaryotic cells outnumber 
eukaryotic organisms in the oceans, few RNA 
viruses (only 3.4% of the vOTUs) infect bacteria,
a result that is consistent with previous marine 
virome and virus isolate reports (7). 

Second, ecosystem impact might be inferred 
from the “cellular” protein sequences that we 
identified in the RNA virus genomes, which we 
posited may parallel the “auxiliary metabolic 
genes” (AMGs) that are ecologically important 
in marine prokaryotic dsDNA viruses (38). Al- 
though such cellular protein sequences are 
uncommon in RNA virus genomes, either as 
independent open reading frames or as parts 
of larger virus proteins, we found 72 function- 
ally distinct AMGs in 95 vOTUs (table S6). 
Together, these may hint at how RNA viruses 
manipulate host physiology to maximize virus 
production (Fig. 5). Although chimeric assem- 
blies might artifactually link AMGs to virus 
RNA-directed RNA polymerase (RdRP) sequences,
several lines of evidence argue against
this possibility: (i) 15 AMG-RdRP linkages 
were observed at multiple sampling sites (Fig. 
5), and (ii) even though RNA viruses are rarely 
represented in metatranscriptomes (16), long- 
read sequencing captured three AMG-RdRP 
linkages (data S1). In addition, no AMGs were 
present in any of the 14 virus contigs that were 
putatively derived from EVEs (data S2 and 
materials and methods). Mechanistically, we 
presume such AMGs were acquired by RNA 
virus genomes through copy-choice recombi- 
nation with cellular RNAs, as was originally 
suggested for ubiquitin in togaviruses (39). 
We identified 12 previously reported cases 
of such RdRP-linked AMGs, but only three 
studies assessed their functional context in 
virus infection (table S6). Thus, we used this 
larger dataset to explore the possible biology 
that such AMGs might offer to RNA viruses 
and ecosystems. 

Functionally, the 72 AMG types were diverse, 
with only four cases overlapping with the 12 
previously reported AMGs in RNA virus ge- 
nomes (table S6 and data S1). The most common 
functional type of AMG (15.8%) was involved 
in RNA modifications (RtcB, AlkB, and RNA
2'-phosphotransferase) and posttranslational 
modifications (NADAR and OARD1), which 
may reflect the common need of viruses to 
evade host antiviral responses through the 
repair of virus RNAs and proteins (40, 41). 
Given that viruses must reprogram cells toward 
virus progeny production and that RNA viruses 
have relatively short genomes, it was not 
surprising to see that protein kinases were 
abundant (14.8%), as they would allow broad 
reprogramming capability through limited 
genetic capacity. The frequency of AMGs sug- 
gested that a suite of other processes are 
affected by marine RNA viruses, including 
carbohydrate metabolism (10.9%), translation 
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(8.9%), nutrient transport (7.9%), photosynthesis 
(5.9%), and vacuolar digestion (4.0%). We posit 
that many of these AMGs represent ocean- 
specific RNA virus adaptations that help cellular 
“virus factories” maximize output in the often 
ultralimiting nutrient conditions of seawater. 

Recent experimental work has emerged to 
assess how DNA viruses affect ocean carbon 
export over small scales (42, 43). We sought 
to complement these efforts through Global 
Ocean assessment of RNA viruses by using 
previously developed machine learning and 
ecosystem modeling approaches (materials and 
methods) (10) to evaluate in silico whether
RNA viruses might affect ocean carbon export. 
This revealed that RNA virus abundances 
were strongly predictive of ocean carbon flux 
and identified specific vOTUs that were most
significant for these predictions (fig. S7 and
table S7). Specifically, from 5504 vOTUs, 1243
were identified as part of four highly signifi- 
cant subnetworks (P values = 0) of RNA viruses 
that strongly predicted carbon flux variation 
(fig. S7A). Notably, subnetwork-specific topology 
interrogation by partial least-squares regression 
modeling and leave-one-out cross-validation 
techniques (materials and methods) showed 
that these subnetworks represent predictive 
community biomarkers for carbon export (cross-validated
R² up to 0.79, and, critically, in a 1:1
ratio, which implies capturing the correct mag- 
nitude in the models) (fig. S7A). Further, these 
techniques very conservatively identified 11 RNA 
viruses that were most predictive of carbon flux 
(i.e., VIP score) (table S7 and fig. S7B) and offer
ideal targets for follow-on hypothesis testing. 
Chlorophytes and haptophytes could be assigned 
as hosts for two of these viruses (fig. S7B). These 
algal hosts are thought to be critical components 
in the biological carbon pump (table S3, A17
to A19). 
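
The leave-one-out cross-validation step can be illustrated with a minimal sketch. As a loudly stated simplification, ordinary least squares on a single predictor stands in here for the partial least-squares regression used in the study, and the abundance and flux values are hypothetical. The sketch reports the cross-validated R² and the slope of predicted versus observed values, which is how a 1:1 relationship would be checked.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_cross_validated_r2(xs, ys):
    """Leave-one-out cross-validation: predict each sample from a model
    fitted on all other samples, then compute R^2 = 1 - PRESS / SS_total."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    mean_y = sum(ys) / len(ys)
    press = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / total, predictions


# Hypothetical values: a subnetwork abundance score and measured carbon flux.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

r2, predicted = loo_cross_validated_r2(xs, ys)
# Slope of predicted versus observed; a value near 1 indicates that the
# model captures the magnitude of the response, not only its ranking.
slope, _ = fit_line(ys, predicted)
print(r2, slope)
```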


Conclusions 


For decades, extensive studies have focused on 
plankton dynamics and activity to infer the 
pairwise links among plankton and carbon 
export, including recent experimental work 
with viruses (42, 43). Because these seminal 
studies were focused on narrow geographic 
ranges or oceanic provinces, we sought here 
to instead explore Global Ocean signals by 
taking advantage of the uniform Tara Oceans 
strategy for sampling plankton and sinking 
particles to broadly investigate oceanic con- 
ditions and ecosystem biota (10). Hence, although 
limited by single time points or “snapshot” 
sampling, combining these measurements with 
a robust statistical framework (i.e., network- 
based, cross-validated, multivariate-aware cor- 
relation analysis) enables statistical exploration 
to establish hypotheses about key ecosystem 
players. For this, we can leverage the context of 
hypothesized interactions (25) instead of using 
the more traditional pairwise correlations (e.g., 
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of a member of a specific taxon and an ecosystem
output) from classical studies. 

Notably, previous Tara studies have revealed 
prokaryotic and eukaryotic DNA virus abundances
to provide biological proxies for estimating
carbon export (10, 44), and one even
identified eukaryotic virus abundances as pre- 
dictive for carbon export efficiency (44). How- 
ever, the RNA virus diversity and abundance 
analyses presented here represent major ad- 
vances: (i) our ecological unit and abundance 
calculation methods (from contigs to high- 
quality genomes) were extensively evaluated 
and found to be robust and suitable for sen- 
sitive ecological analyses (figs. S4 and S5); (ii) 
our analyses comprised purely RNA
viruses, because we captured 25-fold more data
that are not dominated by eukaryotic dsDNA 
viruses; and (iii) our analyses included polar 
waters, which are critical for carbon export 
(fig. S8). Together, these findings provide a 
roadmap for studying RNA viruses in nature, 
as well as evidence that RNA viruses play im- 
portant roles in the ocean ecosystem. 
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Quantum optimization of maximum independent set 


using Rydberg atom arrays
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V. Vuletić12*, M. D. Lukin1*


Realizing quantum speedup for practically relevant, computationally hard problems is a central challenge 
in quantum information science. Using Rydberg atom arrays with up to 289 qubits in two spatial 
dimensions, we experimentally investigate quantum algorithms for solving the maximum independent 
set problem. We use a hardware-efficient encoding associated with Rydberg blockade, realize 
closed-loop optimization to test several variational algorithms, and subsequently apply them to 
systematically explore a class of graphs with programmable connectivity. We find that the problem 
hardness is controlled by the solution degeneracy and number of local minima, and we experimentally 
benchmark the quantum algorithm's performance against classical simulated annealing. On the 
hardest graphs, we observe a superlinear quantum speedup in finding exact solutions in the deep circuit
regime and analyze its origins.


Combinatorial optimization is ubiquitous
in many areas of science and technology.
Many such problems have been shown
to be computationally hard and form
the basis for understanding complexity
classes in modern computer science (1). The
use of quantum machines to accelerate solving 
such problems has been theoretically explored 
for over two decades with a variety of quan- 
tum algorithms (2-4). Typically, a relevant cost 
function is encoded in a quantum Hamiltonian 
(5), and its low-energy state is sought starting 
from a generic initial state, either through an 
adiabatic evolution (2) or a variational ap- 
proach (3), via closed optimization loops (6, 7). 
The computational performance of such al- 
gorithms has been investigated theoretically 
(4, 8-13) and experimentally (14-16) in small 
quantum systems with shallow quantum cir- 
cuits, or in systems lacking the many-body 
coherence believed to be central for quantum 
advantage (17, 18). However, these studies offer 


1Department of Physics, Harvard University, Cambridge, MA
02138, USA. 2QuEra Computing Inc., Boston, MA 02135,
USA. 3Department of Physics and Astronomy, University of
Waterloo, Waterloo N2L 3G1, Canada. 4Perimeter Institute for
Theoretical Physics, Waterloo, Ontario N2L 2Y5, Canada.
5School of Engineering and Applied Science, Harvard
University, Cambridge, MA 02138, USA. 6Google Quantum AI,
Venice, CA 90291, USA. 7Center for Theoretical Physics,
Massachusetts Institute of Technology, Cambridge, MA
02139, USA. 8School of Natural Sciences, Institute for
Advanced Study, Princeton, NJ 08540, USA. 9Walter Burke
Institute for Theoretical Physics, California Institute of
Technology, Pasadena, CA 91125, USA. 10Institute for
Theoretical Physics, University of Innsbruck, A-6020
Innsbruck, Austria. 11Institute for Quantum Optics and
Quantum Information, Austrian Academy of Sciences,
A-6020 Innsbruck, Austria. 12Department of Physics and
Research Laboratory of Electronics, Massachusetts Institute
of Technology, Cambridge, MA 02139, USA.

*Corresponding author. Email: greiner@physics.harvard.edu (M.G.); 
vuletic@mit.edu (V.V.); lukin@physics.harvard.edu (M.D.L.) 

These authors contributed equally to this work. +Present address: 
AWS Center for Quantum Computing, Pasadena, CA 91125, USA. 


Ebadi et al., Science 376, 1209-1215 (2022) 


only limited insights into algorithms’ per- 
formances in the most interesting regime 
involving large system sizes and high circuit 
depths (19, 20). 

Here we use a quantum device based on co- 
herent, programmable arrays of neutral atoms 
trapped in optical tweezers to investigate quan- 
tum optimization algorithms for systems rang- 
ing from 39 to 289 qubits, and effective depths 
sufficient for the quantum correlations to 
spread across the entire graph. Specifically, 
we focus on maximum independent set, a 
paradigmatic NP-hard optimization problem 
(21). It involves finding the largest indepen- 
dent set of a graph—a subset of vertices such 
that no edges connect any pair in the set. An 
important class of such maximum indepen- 
dent set problems involves unit disk graphs, 
which are defined by vertices on a two- 
dimensional plane with edges connecting all 
pairs of vertices within a unit distance of one 
another (Fig. 1, A and B). Such instances arise 
naturally in problems associated with geomet- 
ric constraints that are important for many 
practical applications, such as modeling wire- 
less communication networks (22, 23). Al- 
though there exist polynomial-time classical 
algorithms to find approximate solutions to 
the maximum independent set problem on 
such graphs (24), solving the problem exactly is 
known to be NP-hard in the worst case (23, 25). 
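
To make the definitions concrete, the following sketch builds a unit disk graph from point coordinates and finds all maximum independent sets by exhaustive search. It is only an illustration for very small graphs (the search is exponential in the number of vertices); the function names are our own, and the example geometry, a fully filled 3 × 3 square lattice with a radius that reaches diagonal neighbors, is a hypothetical toy case.

```python
from itertools import combinations
from math import dist


def unit_disk_edges(points, radius=1.0):
    """Edges (i, j) with i < j for every pair of points within `radius`."""
    return {
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if dist(points[i], points[j]) <= radius
    }


def is_independent(subset, edges):
    """True if no two vertices of the (sorted) subset share an edge."""
    return all((i, j) not in edges for i, j in combinations(subset, 2))


def maximum_independent_sets(points, radius=1.0):
    """Return (size, list of all maximum independent sets) by brute force."""
    edges = unit_disk_edges(points, radius)
    n = len(points)
    for size in range(n, 0, -1):
        found = [s for s in combinations(range(n), size)
                 if is_independent(s, edges)]
        if found:
            return size, found
    return 0, [()]


# 3 x 3 lattice with unit spacing; vertex index = 3 * row + column.
points = [(float(x), float(y)) for y in range(3) for x in range(3)]
# A radius of 1.5 connects nearest and diagonal (distance sqrt(2)) neighbors.
size, sets = maximum_independent_sets(points, radius=1.5)
print(size, sets)
```

For this lattice the only maximum independent set consists of the four corner sites.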


Maximum independent set on Rydberg 
atom arrays 


Our approach uses a two-dimensional atom 
array described previously (26). Excitation 
from a ground state $|0\rangle$ into a Rydberg state
$|1\rangle$ is used for hardware-efficient encoding
of the unit disk maximum independent
set problem (27). For a particular graph, we 
create a geometric configuration of atoms 
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using optical tweezers such that each atom 
represents a vertex. The edges are drawn 
according to the unit disk criterion for a unit 
distance given by the Rydberg blockade radius 
$R_b$ (Fig. 1C), the distance within which excitation
of more than one atom to the Rydberg
state is prohibited because of strong interac- 
tions (28). The Rydberg blockade mechanism 
thus restricts the evolution primarily to the 
subspace spanned by the states that obey the 
independent set constraint of the problem 
graph. Quantum algorithms for optimization 
are implemented via global atomic excitation 
using homogeneous laser pulses with a time-varying
Rabi frequency (and a time-varying
phase) $\Omega(t)e^{i\phi(t)}$ and detuning $\Delta(t)$ (Fig. 1D).
The resulting quantum dynamics is governed
by the Hamiltonian $H = H_q + H_{\mathrm{cost}}$, with the
quantum driver $H_q$ and the cost function $H_{\mathrm{cost}}$
given by

$$H_q = \frac{\hbar}{2} \sum_i \left[ \Omega(t)\, e^{i\phi(t)}\, |0\rangle_i \langle 1| + \mathrm{h.c.} \right],$$

$$H_{\mathrm{cost}} = -\hbar\Delta(t) \sum_i n_i + \sum_{i<j} V_{ij}\, n_i n_j, \tag{1}$$

where $n_i = |1\rangle_i \langle 1|$, and $V_{ij} = V_0/(|\mathbf{r}_i - \mathbf{r}_j|)^6$ is
the interaction potential that sets the blockade
radius $R_b$ and determines the connectivity
of the graph. For a positive laser detuning $\Delta$,
the many-body ground state of the cost function
Hamiltonian maximizes the total number
of qubits in the Rydberg state under the
blockade constraint, corresponding to the
largest independent set MIS(G) (hereafter
MIS) of the underlying unit disk graph G (27)
(Fig. 1E). Even with the finite blockade energy
and long-range interaction tails, we empirically
find that the ground states of $H_{\mathrm{cost}}$ still encode
an MIS for the ensemble of graphs studied
here [see (25, 27)].
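
As a small check of how the cost function selects independent sets, the sketch below evaluates the classical energy of Rydberg occupation patterns, that is, the cost function of Eq. 1 with each atom either in the ground state (0) or the Rydberg state (1). The three-atom chain and the values of the detuning and interaction strength are hypothetical and in arbitrary units with ħ = 1; they are chosen so that nearest neighbors are strongly blockaded while next-nearest neighbors interact only weakly.

```python
from itertools import combinations, product
from math import dist


def cost_energy(occupation, positions, detuning, v0):
    """Classical energy -detuning * sum(n_i) + sum_{i<j} V_ij n_i n_j,
    with V_ij = v0 / r_ij**6 and units in which hbar = 1."""
    energy = -detuning * sum(occupation)
    for i, j in combinations(range(len(positions)), 2):
        if occupation[i] and occupation[j]:
            energy += v0 / dist(positions[i], positions[j]) ** 6
    return energy


# Hypothetical chain of three atoms with unit spacing.
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
detuning, v0 = 1.0, 10.0  # V(r=1) = 10 > detuning > V(r=2) = 10/64

# Enumerate all 2**3 occupation patterns and pick the lowest-energy one.
ground = min(product((0, 1), repeat=3),
             key=lambda occ: cost_energy(occ, positions, detuning, v0))
print(ground, cost_energy(ground, positions, detuning, v0))
```

The lowest-energy pattern excites the two end atoms, which is the maximum independent set of the three-vertex path graph; the small residual energy shift comes from the long-range interaction tail between the two excited atoms.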


Variational optimization via a closed 
quantum-classical loop 


In the experiment, we deterministically pre- 
pare graphs with vertices occupying 80% of 
an underlying square lattice, with the block- 
ade extending across nearest and next-nearest 
(diagonal) neighbors (Fig. 1C). This allows us 
to explore a class of nonplanar graphs for 
which finding the exact solution of MIS is 
NP-hard for worst-case instances (25). To 
prepare quantum states with a large overlap 
with the MIS solution space, we use a family of 
variational quantum optimization algorithms 
using a quantum-classical optimization loop. 
We place atoms at positions defined by the 
vertices of the chosen graph, initialize them in 
state $|0\rangle$, and implement a coherent quantum
evolution corresponding to the specific choice 
of variational parameters (Fig. 1D). Subse- 
quently, we sample the wave function with 
a projective measurement and determine the 
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Fig. 1. Hardware-efficient encoding of the maximum independent set 
using Rydberg atom arrays. (A) An example of a unit disk graph, with any 
single vertex (e.g., the blue vertex) being connected to all other vertices 
within a disk of unit radius. (B) A corresponding MIS solution (denoted by the 
red nodes). (C) The maximum independent set problem is encoded with
atoms placed at the vertices of the target graph and with interatomic spacing 
chosen such that the unit disk radius of the graph corresponds to the 
Rydberg blockade radius. Shown is an example fluorescence image of atoms, 


size of the output independent set by counting
the number of qubits in $|1\rangle$, using classical postprocessing
to remove blockade violations and
reduce detection errors (25) (Fig. 1E). This procedure
is repeated multiple times to estimate
the mean independent set size $\langle \sum_i n_i \rangle$ of the
sampled wave function, the approximation
ratio $R = \langle \sum_i n_i \rangle/|\mathrm{MIS}|$, and the probability
$P_{\mathrm{MIS}}$ of observing an MIS (where |MIS| denotes
the size of an MIS of the graph). The classical
optimizer tries to maximize $\langle \sum_i n_i \rangle$ by updating
the variational parameters in a closed-loop
hybrid quantum-classical optimization protocol
(25) (Fig. 1D).
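
The figures of merit defined above can be estimated from a list of measured bit strings as sketched below. The postprocessing used in the experiment is described in (25) and is not reproduced here; the greedy clean-up in this sketch, which repeatedly de-excites the vertex involved in the most blockade violations, is a hypothetical stand-in, and the samples are invented for illustration.

```python
def remove_violations(bits, edges):
    """Greedy stand-in for postprocessing: while any edge has both ends
    excited, de-excite the vertex with the most violated edges."""
    bits = list(bits)
    while True:
        conflicts = [0] * len(bits)
        for i, j in edges:
            if bits[i] and bits[j]:
                conflicts[i] += 1
                conflicts[j] += 1
        worst = max(range(len(bits)), key=conflicts.__getitem__)
        if conflicts[worst] == 0:
            return tuple(bits)
        bits[worst] = 0


def summarize(samples, edges, mis_size):
    """Mean independent set size, approximation ratio R, and P_MIS."""
    sizes = [sum(remove_violations(s, edges)) for s in samples]
    mean_size = sum(sizes) / len(sizes)
    return {
        "mean_size": mean_size,
        "approx_ratio": mean_size / mis_size,
        "p_mis": sum(size == mis_size for size in sizes) / len(sizes),
    }


# Hypothetical measurements on a three-vertex path graph (|MIS| = 2).
edges = [(0, 1), (1, 2)]
samples = [(1, 0, 1), (1, 1, 0), (0, 1, 0), (1, 0, 1)]
summary = summarize(samples, edges, mis_size=2)
print(summary)
```

In a closed-loop setting, a classical optimizer would call such a routine after each batch of measurements and use the mean set size as the objective to maximize.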

We test two algorithm classes, defined by 
different parametrizations of the quantum 
driver and the cost function in Eq. 1. The first 
approach consists of resonant ($\Delta = 0$) laser
pulses of varying durations $\tau_i$ and phases $\phi_i$
(Fig. 2A). This algorithm closely resembles the 
canonical quantum approximate optimization 
algorithm (QAOA) (3), but instead of exact 
single-qubit rotations, resonant driving gen- 
erates an effective many-body evolution within 
the subspace of independent sets associated 
with the blockade constraint (25). Phase jumps 
between consecutive pulses implement a global 
phase gate (29), with a phase shift propor- 
tional to the cost function of the maximum 
independent set problem in the subspace of 
independent sets (see eq. S2). Taken together, 
these implement the QAOA, where each pulse 
duration $\tau_i$ and phase $\phi_i$ are used as variational
parameters.

The performance of QAOA as a function of 
depth p (the number of pulses) is shown in 
Fig. 2B for an instance of a 179-vertex graph 
embedded in a 15 × 15 lattice. We find that
the approximation ratio grows as a function 
of the number of pulses up to p = 4, and 
increasing the depth further does not appear 
to lead to better performance (Fig. 2B). As 
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


discussed in (25), we attribute these perform- 
ance limitations to the difficulty of finding 
the optimal QAOA parameters for large depths 
within a limited number of queries to the ex- 
periment, leakage out of the independent set 
subspace during resonant excitation due to 
imperfect blockade associated with the finite 
interaction energy between next-nearest neigh- 
bors, and laser pulse imperfections. 

The second approach is a variational quan- 
tum adiabatic algorithm (VQAA) (2, 30), 
related to methods previously used to prepare 
quantum many-body ground states (26, 31, 32). 
In this approach, we sweep the detuning $\Delta$
from an initial negative detuning $\Delta_0$ to a final
large positive value $\Delta_f$ at constant Rabi frequency
$\Omega$, along a piecewise-linear schedule
characterized by a total number of segments $f$,
the duration $\tau_i$ of each, and the end detuning
$\Delta_i$ of each segment. Moreover, we turn on the
coupling $\Omega$ in duration $\tau_\Omega$ and smoothen the
detuning sweep using a low-pass filter with a
characteristic filter time $\tau_\Delta$ (Fig. 2C), both of
which minimize nonadiabatic excitations and
serve as additional variational parameters. For
this evolution, we define an effective circuit
depth $p$ as the duration of the sweep ($T = \tau_1 +
\ldots + \tau_f$) in units of the $\pi$-pulse time $\tau_\pi$, which
is the time required to perform a spin flip
operation. 
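
The parametrization of the sweep can be sketched as follows. This is a simplified illustration: it omits the low-pass filter and the ramp of the Rabi coupling, assumes the standard relation that the π-pulse time equals π divided by the (angular) Rabi frequency, and uses hypothetical segment durations and detunings in arbitrary units.

```python
from math import pi


def detuning_at(t, delta_start, segments):
    """Piecewise-linear detuning at time t.

    `segments` is a list of (duration, end_detuning) pairs; the sweep
    starts at `delta_start` and holds its final value after the last segment.
    """
    t_begin, d_begin = 0.0, delta_start
    for duration, d_end in segments:
        if t <= t_begin + duration:
            return d_begin + (d_end - d_begin) * (t - t_begin) / duration
        t_begin, d_begin = t_begin + duration, d_end
    return d_begin


def effective_depth(segments, rabi_frequency):
    """Total sweep duration in units of the pi-pulse time (pi / Omega)."""
    total_time = sum(duration for duration, _ in segments)
    pi_pulse_time = pi / rabi_frequency  # assumption: tau_pi = pi / Omega
    return total_time / pi_pulse_time


# Hypothetical three-segment sweep: fast, slow near the middle, fast again.
segments = [(1.0, 0.0), (2.0, 1.0), (1.0, 4.0)]
delta_start = -2.0
rabi_frequency = pi  # chosen so that the pi-pulse time is exactly 1

depth = effective_depth(segments, rabi_frequency)
print(depth, detuning_at(2.0, delta_start, segments))
```

The variational parameters that a classical optimizer would adjust are the entries of `segments`; a one-segment list recovers a plain linear sweep.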

We find that with only three segments op- 
timized for an effective depth of p = 10 (Fig. 2D 
inset), the optimizer converges to a pulse that 
substantially outperforms the QAOA approach 
described above. Furthermore, the optimized 
pulse shows a better performance compared 
to a linear (one-segment) detuning sweep of 
the same p (Fig. 2D). We find that similar 
pulse shapes produce high approximation 
ratios for a variety of graphs (see, e.g., fig. S8C), 
consistent with theoretical predictions of pulse 
shape concentration (20, 25, 33, 34). At large 
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


with gray lines added to indicate edges between connected vertices. (D) The 
system undergoes coherent quantum many-body evolution under a programmable
laser drive [$\Omega(t)$, $\phi(t)$, $\Delta(t)$] and long-range Rydberg interactions
$V_{ij}$. (E) A site-resolved projective measurement reads out the final quantum
many-body state, with atoms excited to the Rydberg state (red circles)
corresponding to vertices forming an independent set. A classical optimizer
uses the results to update the parameters of the quantum evolution [$\Omega(t)$,
$\phi(t)$, $\Delta(t)$] to maximize a figure of merit for finding an MIS.


sweep times (p > 15), we observe a turn-around 
in the performance likely associated with de- 
coherence (25). For the remainder of this work, 
we focus on the quantum adiabatic algorithm 
for solving maximum independent set. 


Quantum optimization on different graphs 


The experimentally optimized quasi-adiabatic 
sweep (depicted in Fig. 2D) was applied to 
115 randomly generated graphs of various 
sizes (N = 80 to 289 vertices). For graphs of the 
same size (N = 180), the approximation error
1 - R decreases and the probability of finding
an MIS solution $P_{\mathrm{MIS}}$ increases with the effective
circuit depth at early times, with the former
showing a scaling consistent with a power-law
relation for short effective depths (Fig. 3A and
fig. S15) (25). We find a strong correlation
between the performance of the quantum algorithm
on a given graph and its total number
of MIS solutions, which we refer to as the MIS
degeneracy $D_{|\mathrm{MIS}|}(G)$ (hereafter $D_{|\mathrm{MIS}|}$). This
quantity is calculated classically using a tensor
network algorithm (25, 35) and varies by
nine orders of magnitude across different
180-vertex graphs. We observe a clear logarithmic
relation between $D_{|\mathrm{MIS}|}$ and the approximation
error 1 - R, accompanied by a
nearly three-orders-of-magnitude variation of
$P_{\mathrm{MIS}}$ at a fixed depth p = 20 (Fig. 3B). $P_{\mathrm{MIS}}$ does
not scale linearly with the MIS degeneracy, as
would be the case for a naive algorithm that
samples solutions at random. Figure 3C shows
the sharp collapse of 1 - R as a function of the
logarithm of the MIS degeneracy normalized
by the graph size, $\rho = \log(D_{|\mathrm{MIS}|})/N$. This quantity,
a measure of MIS degeneracy density,
determines the hardness in approximating
solutions for the quantum algorithm at shallow
depths.

These observations can be modeled as re- 
sulting from a Kibble-Zurek-type mechanism 
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Fig. 2. Testing variational quantum algorithms. (A 
optimization algorithm (QAOA), consisting of sequent
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$\tau_i$ and laser phase $\phi_i$. (B) Variational optimization of QAOA parameters results in a decrease in approximation
error 1 - R, up to depth p = 4 (inset: example performance of quantum-classical closed-loop optimization
at p = 5). Approximation error calculated using the top 50 percentiles of independent set sizes ($1 - R_{0.5}$) is
used as the figure of merit to reduce effects of experimental imperfections on the optimization procedure
(25). (C) Quantum evolution can also be parametrized as a variational quantum adiabatic algorithm
(VQAA) using a quasi-adiabatic pulse with a piecewise-linear sweep of detuning $\Delta(t)$ at constant Rabi
coupling $\Omega(t)$. $\Omega(t)$ is turned on and off within $\tau_\Omega$, and a low-pass filter with time scale $\tau_\Delta$ is used to smoothen
the $\Delta(t)$ sweep. (D) Performance of a rescaled piecewise-linear sweep as a function of its effective
depth $p = (\tau_1 + \ldots + \tau_f)/\tau_\pi$. Variational optimization of a three-segment (orange) piecewise-linear pulse
(optimized for p = 10) improves on the performance of a simple one-segment linear (blue) pulse, as
well as the best results from QAOA [inset: detuning sweep profiles for one-segment (blue) and three-segment
(orange) optimized pulses, for a total pulse duration of 2.0 μs]. Error bars for approximation ratio R are
the SEM here and throughout the text and are smaller than the points.


where the quantum algorithm locally solves the 
graph in domains whose sizes are determined 
by the evolution time and speed at which 
quantum information propagates (36, 37). 
We show that the scaling of the approximation 
error with depth can originate from the con- 
flicts between local solutions at the boundaries 
of these independent domains (25). In graphs 
with a large degeneracy density $\rho$, there may
exist many MIS configurations that are com- 
patible with the local ordering in these do- 
mains. This provides a possible mechanism 
to reduce domain walls at their boundaries 
(fig. S14) and decrease the approximation 
error. Such a scenario would predict a linear 
relation between 1 - R and $\rho$ at a fixed depth,
which is consistent with our observations
(Fig. 3C and fig. S15).
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Benchmarking against simulated annealing 

To benchmark the results of the quantum 
optimization against a classical algorithm, 
we use simulated annealing (SA) (38). It seeks 
to minimize the energy of a cost Hamiltonian 
by thermally cooling a system of classical spins 
while maintaining thermal equilibrium. Al- 
though some specifically tailored state-of- 
the-art algorithms (24, 39) may have better 
performance than SA in solving the maximum 
independent set problem, we have chosen 
SA for extensive benchmarking because sim- 
ilar to the quantum algorithms used, it is a 
general-purpose algorithm that only relies on 
information from the cost Hamiltonian for 
solving the problem. Our highly optimized 
variant of SA stochastically updates local clus- 
ters of spins using the Metropolis-Hastings 
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(40) update rule, rejecting energetically un- 
favorable updates with a probability depen- 
dent on the energy cost and the instantaneous 
temperature (25). We use collective updates 
under the MIS Hamiltonian cost function (eq.
S15), which applies an optimized uniform interaction
energy to each edge, penalizing states
that violate the independent set criterion (25).
The annealing depth $p_{\mathrm{SA}}$ is defined as the average
number of attempted updates per spin.
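
For orientation, a minimal simulated-annealing loop for the maximum independent set problem is sketched below. It is deliberately simpler than the optimized variant described above: it uses single-vertex Metropolis updates rather than collective cluster updates, and the geometric cooling schedule, temperatures, and edge penalty are hypothetical choices rather than the values used in this work. The depth is counted as the average number of attempted updates per spin.

```python
import math
import random


def energy_change(bits, v, neighbors, penalty):
    """Energy change from flipping vertex v, for the cost
    E = -sum(n_i) + penalty * (number of edges with both ends occupied)."""
    occupied_neighbors = sum(bits[u] for u in neighbors[v])
    cost_of_adding = -1 + penalty * occupied_neighbors
    return -cost_of_adding if bits[v] else cost_of_adding


def simulated_annealing(n, edges, depth, t_start=2.0, t_end=0.05,
                        penalty=2.0, seed=0):
    """Single-vertex Metropolis annealing; returns the final bit list."""
    rng = random.Random(seed)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    bits = [0] * n
    steps = depth * n  # depth = average attempted updates per spin
    for step in range(steps):
        # Geometric cooling from t_start to t_end (hypothetical schedule).
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        v = rng.randrange(n)
        delta = energy_change(bits, v, neighbors, penalty)
        # Metropolis rule: always accept downhill moves, accept uphill
        # moves with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            bits[v] ^= 1
    return bits


# Usage on a five-vertex path graph.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
result = simulated_annealing(5, path_edges, depth=50)
print(result, sum(result))
```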

We compare the quantum algorithm and SA 
on two metrics: the approximation error 1 - R, 
and the probability of sampling an exact solution
$P_{\mathrm{MIS}}$, which determines the inverse of time-to-solution.
As shown in Fig. 4A, for relatively
shallow depths and moderately hard graphs, 
optimized SA results in approximation errors 
similar to those observed on the quantum de- 
vice. In particular, we find that the hardness in 
approximating the solution for short SA depths 
is also controlled by degeneracy density $\rho$ (fig.
S18, A and B). However, some graph instances 
appear to be considerably harder for SA com- 
pared to the quantum algorithm at higher depths 
(see, e.g., gold and purple curves in Fig. 4A). 
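
The link between the probability of sampling an exact solution and the time-to-solution can be made explicit with a short derivation. It assumes that repetitions are independent and that each one returns an MIS with the same probability; the target failure probability $\varepsilon$ is a symbol introduced here for illustration.

```latex
% Assumption: independent runs, each finding an MIS with probability P_MIS.
% Require that the chance of seeing no MIS in k runs is at most epsilon.
\begin{align}
  (1 - P_{\mathrm{MIS}})^{k} &\le \varepsilon \\
  \Longrightarrow\quad
  k &\ge \frac{\ln(1/\varepsilon)}{-\ln(1 - P_{\mathrm{MIS}})}
    \approx \frac{\ln(1/\varepsilon)}{P_{\mathrm{MIS}}}
    \qquad (P_{\mathrm{MIS}} \ll 1).
\end{align}
% Example: P_MIS = 0.1 and epsilon = 0.01 give
% k >= ln(100) / (-ln 0.9) = 4.605 / 0.1054, i.e. about 44 repetitions.
```

The required number of repetitions, and hence the runtime, therefore scales as the inverse of the success probability when that probability is small.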

Detailed analysis of the SA dynamics for 
graphs with low degeneracy densities $\rho$ reveals
that for some instances, the approximation ratio 
displays a plateau at R = (|MIS| - 1)/|MIS|, 
corresponding to independent sets with one 
less vertex than an MIS (Fig. 4A, gold and 
purple solid lines). Graphs displaying this be- 
havior have a large number of local minima 
with independent set size |MIS| - 1, in which 
SA can be trapped up to large depths. By 
analyzing the dynamics of SA at low temperatures
as a random walk among |MIS| - 1 and
|MIS| configurations (Fig. 4D), we show in 
(25) that the ability of SA to find a global 
optimum is limited by the ratio of the num- 
ber of suboptimal independent sets of size 
|MIS| - 1 to the number of ways to reach 
global minima, resulting in a “hardness parameter”
$\mathrm{HP} = D_{|\mathrm{MIS}|-1}/(|\mathrm{MIS}|\, D_{|\mathrm{MIS}|})$ (Fig. 4E).
This parameter lower bounds the mixing 
time for the Markov chain describing the SA 
dynamics at low temperatures (eq. S19), and 
it appears to increase exponentially with the 
square root of the system size for the hardest 
graphs (fig. S11). This suggests that a large 
number of local minima cause SA to take an 
exponentially long time to find an MIS for the 
hardest cases as N grows. If SA performance 
saturates this lower bound, consistent with 
numerics (fig. S19), its runtime to find an MIS 
is polynomially related to the best known 
exact classical algorithms (41).
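
For small graphs the hardness parameter can be evaluated by direct enumeration, as in the sketch below. Two assumptions are made here: the count at size |MIS| - 1 includes every independent set of that size, and brute-force enumeration replaces the tensor network counting used for the large graphs in this work. The example graphs are toy cases.

```python
from itertools import combinations


def independent_set_counts(n, edges):
    """Number of independent sets of each size, by brute force.

    Stops at the first size with no independent set, which is valid
    because every subset of an independent set is also independent.
    """
    edge_set = {frozenset(e) for e in edges}
    counts = {}
    for size in range(n + 1):
        count = sum(
            1
            for subset in combinations(range(n), size)
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2))
        )
        if count == 0:
            break
        counts[size] = count
    return counts


def hardness_parameter(n, edges):
    """HP = D_{|MIS|-1} / (|MIS| * D_{|MIS|}) for a graph with n >= 1 vertices."""
    counts = independent_set_counts(n, edges)
    mis_size = max(counts)
    return counts[mis_size - 1] / (mis_size * counts[mis_size])


# Five-vertex path: one MIS of size 3, six independent sets of size 2.
path_hp = hardness_parameter(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
# Five-vertex cycle: five MISs of size 2, five independent sets of size 1.
cycle_hp = hardness_parameter(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)])
print(path_hp, cycle_hp)
```

The path graph has the larger value because its single optimum is surrounded by comparatively many near-optimal configurations.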


Quantum speedup on the hardest graphs 


We now turn to study the algorithms’ ability 
to find exact solutions on the hardest graphs 
(with up to N = 80), chosen from graphs in 
the top two percentile of the hardness pa- 
rameter HP (fig. S11). We find that for some 
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of these graphs (e.g., gold curves in Fig. 4, A to 
C), the quantum algorithm quickly approaches 
the correct solutions, reducing the average 
Hamming distance (number of spin flips 
normalized by N) to the closest MIS and increasing
$P_{\mathrm{MIS}}$, while SA remains trapped in
local minima at a large Hamming distance 
from any MIS. For other instances (e.g., purple 
curves in Fig. 4, A to C), both the quantum 
algorithm and SA have difficulty finding the 
correct solution. Moreover, in contrast to our 
earlier observations suggesting variational 
parameter concentration for generic graphs, 
we find that for these hard instances, the 
quantum algorithm needs to be optimized for 
each graph individually by scanning the slowdown
point of the detuning sweep $\Delta(t)$ to maximize
$P_{\mathrm{MIS}}$ (Fig. 5, A and B, and fig. S9) (25).

Figure 4E shows the resulting highest $P_{\mathrm{MIS}}$
reached within a depth of 32 for each hard
graph instance as a function of the classical
hardness parameter HP. For simulated
annealing, we find the scaling $P_{\mathrm{MIS}} = 1 -
\exp(-C\,\mathrm{HP}^{-1.03(4)})$, where C is a positive fitted
constant, which is in good agreement with 
theoretical expectations (25). Although for many 
instances the quantum algorithm outperforms 
SA, there are significant instance-by-instance 
variations, and on average, we observe a similar
scaling $P_{\mathrm{MIS}} = 1 - \exp(-C\,\mathrm{HP}^{-0.95(15)})$
(dashed red line).
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To understand these observations, we carried 
out detailed analyses of both classical and 
quantum algorithms’ performance for hard 
graph instances. Specifically, in (25) we show 
that for a broad class of SA algorithms with 
both single-vertex and correlated updates, the 
scaling is at best $P_{\mathrm{MIS}} = 1 - \exp(-C\,\mathrm{HP}^{-1})$
(where C generally could have polynomial 
dependence on the system size), indicating 
that the observed scaling of our version of SA 
is close to optimal. To gain insight into the 
origin of the quantum scaling, we numerically
compute the minimum energy gap $\delta_{\min}$
during the adiabatic evolution using the density-matrix
renormalization group (Fig. 5A) (25).
Figure 5C shows that the performance of the 
quantum algorithm is mostly well described by 
quasi-adiabatic evolution with transition prob- 
ability out of the ground state governed by the 
minimum energy gap, according to the Landau-Zener
formula $P_{\mathrm{MIS}} = 1 - \exp(-A\,\delta_{\min}^{\eta})$ for a
constant $A$ and $\eta = 1.2(2)$ (42). This observation
suggests that our quantum algorithm
achieves near-maximum efficiency, consistent
with the smallest possible value of $\eta = 1$ obtained
for optimized adiabatic following (43). 
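
The Landau-Zener form quoted above can be explored numerically. The sketch below evaluates $P_{\mathrm{MIS}} = 1 - \exp(-A\,\delta_{\min}^{\,n})$ and recovers n from two (gap, probability) points. The function names, the constant A = 3, and the gap values (in arbitrary units) are hypothetical choices for illustration, not quantities taken from the experiment.

```python
import math

def landau_zener_success(delta_min, a_const, n):
    """Success probability 1 - exp(-A * delta_min**n)."""
    return 1.0 - math.exp(-a_const * delta_min ** n)

def gap_exponent(delta_1, p_1, delta_2, p_2):
    """Estimate n from two points, using the fact that
    log(-log(1 - P)) is linear in log(delta_min) with slope n."""
    y1 = math.log(-math.log(1.0 - p_1))
    y2 = math.log(-math.log(1.0 - p_2))
    return (y2 - y1) / (math.log(delta_2) - math.log(delta_1))

a_const = 3.0  # hypothetical constant A, arbitrary units
p_large_gap = landau_zener_success(0.5, a_const, 1.2)
p_small_gap = landau_zener_success(0.25, a_const, 1.2)
n_recovered = gap_exponent(0.25, p_small_gap, 0.5, p_large_gap)
```

Halving the gap multiplies $-\log(1 - P_{\mathrm{MIS}})$ by $2^{-n}$, so a smaller n means the success probability degrades more slowly as the gap closes.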

Fig. 3. Quantum algorithm performance across different graphs. (A) The
approximation error 1 - R for an optimized quasi-adiabatic sweep plotted as a
function of effective depth p on four graphs of the same size (N = 180 vertices),
showing strong dependence on the number of MIS solutions (MIS degeneracy)
$D_{|\mathrm{MIS}|}$ (inset: corresponding MIS probability $P_{\mathrm{MIS}}$ versus p).
(B) At a fixed depth p = 20, 1 - R and $P_{\mathrm{MIS}}$ for various 180-vertex
graphs are strongly correlated with $D_{|\mathrm{MIS}|}$. (C) At the same effective
depth p = 20, 1 - R for 115 graphs of different sizes (N = 80 to 289) and MIS
degeneracies $D_{|\mathrm{MIS}|}$ exhibits universal scaling with the degeneracy
density $\rho = \log(D_{|\mathrm{MIS}|})/N$ (inset: data plotted as a function of N).
Error bars for $P_{\mathrm{MIS}}$, here and throughout the text, denote the 68%
confidence interval.

By focusing only on instances with large
enough spectral gaps such that the evolution
time T obeys the “speed limit” determined by
the uncertainty principle ($\delta_{\min} > 1/T$) associated
with Landau-Zener scaling (42), we find an
improved quantum algorithm scaling
$P_{\mathrm{MIS}} = 1 - \exp(-C\,\mathcal{HP}^{-0.63(13)})$ (Fig. 4E, solid red line).
Because $1/[-\log(1 - P_{\mathrm{MIS}})] \approx 1/P_{\mathrm{MIS}}$ is
proportional to the runtime sufficient to find a
solution by repeating the experiment, the smaller
exponent observed in the scaling for the quantum
algorithm ($\sim\mathcal{HP}^{1.03(4)}$ for SA and $\sim\mathcal{HP}^{0.63(13)}$
for the quantum algorithm) suggests a superlinear
[with a ratio in scaling of 1.6(3)] speedup
in the runtime to find an MIS, for graphs
where the deep-circuit regime ($T > 1/\delta_{\min}$)
is reached. Moreover, the observed scaling
is not altered by the postprocessing used on
the experimental data (25). We emphasize
that achieving this speedup requires an
effective depth large enough to probe the lowest-
energy many-body states of the system; by
contrast, no speedup is observed for graph
instances where this depth condition is not
fulfilled.
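
The runtime argument above can be checked with a short calculation. The sketch below assumes central runtime exponents of 1.03 for SA and 0.63 for the quantum algorithm (uncertainties ignored) and equal, unknown prefactors, so only the scaling with hardness is meaningful. All names are illustrative.

```python
import math

def expected_repetitions(p_success):
    """Scale of the number of repetitions needed when each run succeeds
    with probability p_success: 1 / [-log(1 - p)], roughly 1/p for small p."""
    return 1.0 / (-math.log(1.0 - p_success))

# For a small success probability the two expressions nearly coincide.
p_small = 0.01
approx_ratio = expected_repetitions(p_small) * p_small  # close to 1

# Assumed central exponents of the runtime power laws.
exponent_sa = 1.03
exponent_quantum = 0.63
scaling_ratio = exponent_sa / exponent_quantum

def runtime_scale(hardness, exponent):
    """Runtime up to an unknown constant prefactor, ~ hardness**exponent."""
    return hardness ** exponent

# Relative runtime at an illustrative hardness value of 100
# (prefactors set equal, which is an assumption of this sketch).
advantage_at_100 = runtime_scale(100.0, exponent_sa) / runtime_scale(100.0, exponent_quantum)
```

A ratio of exponents near 1.6 means the SA runtime grows roughly as the quantum runtime raised to the power 1.6, which is what "superlinear" refers to here.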


Discussion and outlook 


Several mechanisms for quantum speedup
in combinatorial optimization problems have
been previously proposed. Grover-type algorithms
are known to have a quadratic speedup
in comparison to brute-force classical search
over all possible solutions (44, 45). A quadratic
quantum speedup has also been suggested
for quantized SA based on discrete quantum
walks (46, 47). However, these methods use
specifically constructed circuits and are not
directly applicable to the algorithms implemented
here. In addition, the following mechanisms
can contribute to the speedup observed
in our system. The quantum algorithm’s
performance in the observed regime appears to
be mostly governed by the minimum energy
gap $\delta_{\min}$ (Fig. 5C). We show that under
certain conditions, one can achieve coherent
quantum enhancement for the minimum gap,
resulting in a quadratic speedup via
$\delta_{\min} \sim \mathcal{HP}^{-1/2}$ (25). In practice, however, we find
that the minimum energy gap does not always
correlate with the classical hardness parameter
$\mathcal{HP}$, as is evident in the spread of the
quantum data in Fig. 4E (see also fig. S21).
Some insights into these effects can be gained
by a more direct comparison of the quantum
algorithm with SA using the same cost function
corresponding to the Rydberg Hamiltonian
(25) (Fig. 5D). Although the observed power-
law scaling supports the possibility of a nearly
quadratic speedup for instances in the deep
circuit regime ($\delta_{\min} > 1/T$), it is an open question
whether such a speedup can be extended,
with a guarantee, in all instances. Finally, it
is possible that $\delta_{\min}$ alone does not fully
determine the quantum performance, as suggested
by the data points that deviate from
the Landau-Zener prediction in Fig. 5C, where
enhancement through diabatic effects could
be possible (34, 48).

Fig. 4. Benchmarking the quantum algorithm against classical simulated
annealing. (A) Performance of the quantum algorithm, and the optimized
simulated annealing with the MIS Hamiltonian, shown as a function of depth (p for
the quantum algorithm and $p_{\mathrm{SA}}$ for simulated annealing) for four 80-vertex graphs.
Green ($\mathcal{HP}$ = 1.8, $\rho$ = 0.13) and gray ($\mathcal{HP}$ = 2.1, $\rho$ = 0.11) graphs are easy
for the quantum and classical algorithm; however, purple ($\mathcal{HP}$ = 69, $\rho$ = 0.08)
and gold ($\mathcal{HP}$ = 68, $\rho$ = 0.06) are significantly harder and show a plateau at
R = (|MIS| - 1)/|MIS|, i.e., independent sets with one less vertex than an MIS.
(B and C) One of the hard graphs (gold) shows much better quantum scaling
of average normalized Hamming distance to the closest MIS, and MIS probability
($P_{\mathrm{MIS}}$) compared to the other graph (purple). By contrast, the performance of
SA (lines) remains similar between the two graphs. (D) Configuration graph
of independent sets of size |MIS| and |MIS| - 1 for an example 39-vertex graph
($\mathcal{HP}$ = 5), where the edges connect two configurations if they are separated
by one step of simulated annealing. At low temperatures, simulated annealing
finds an MIS solution by a random walk on this configuration graph.
(E) $-\log(1 - P_{\mathrm{MIS}})$ for instance-by-instance optimized quantum algorithm (crimson)
and simulated annealing (teal) reached within a depth of 32, for 36 graphs
selected from the top two percentile of hardness parameter $\mathcal{HP}$ for each size.
Power-law fits to the SA (teal, $\sim\mathcal{HP}^{-1.03(4)}$) and the quantum data (dashed crimson
line, $\sim\mathcal{HP}^{-0.95(15)}$) are used to compare scaling performance with graph hardness
$\mathcal{HP}$. The error in the power-law exponents from the fit is the combination of
statistical errors and the error in the least-squares fit. If only graphs with minimum
energy gaps large enough to be resolved in the duration of the quantum evolution
are considered ($\delta_{\min} > 1/T$, excluding hollow data points), the fit (solid crimson line)
shows a superlinear speedup ($\sim\mathcal{HP}^{-0.63(13)}$) over optimized simulated annealing.

Fig. 5. Understanding hardness for the quantum algorithm. (A) Energy
gap between the ground (black) and first-excited (blue) states, calculated
using the density matrix renormalization group (DMRG) for a graph of
65 atoms. (B) To maximize $P_{\mathrm{MIS}}$ for hard graphs, the frequency at which the
detuning sweep is slowed down is varied (see fig. S9). The largest $P_{\mathrm{MIS}}$
corresponds to a slow-down frequency close to the location of the minimum gap.
(C) Measured $P_{\mathrm{MIS}}$ for a fixed effective depth p = 32 as a function of the
calculated minimum gap $\delta_{\min}$. For many instances, the relation is well described

Although the scaling speedup observed here 
suggests a possibility of quantum advantage in 
runtime, to achieve practical runtime speed- 
ups over specialized state-of-the-art heuristic 
algorithms [e.g., (39)], qubit coherence, system 
size, and the classical optimizer loop need to 
be improved. The useful depth accessible via 
quantum evolution is limited by Rydberg-state 
lifetime and intermediate-state laser scatter- 
ing, which can be suppressed by increasing the 
control laser intensity and intermediate-state 
detuning. Advanced error mitigation techniques 
such as STIRAP (49), as well as error correc- 
tion methods, should also be explored to enable 
large-scale implementations. The classical opti- 
mization loop can be improved by speeding 
up the experimental cycle time and by using 
more advanced classical optimizers. Larger 
atom arrays can be realized by using improve- 
ments in vacuum-limited trap lifetimes and 
sorting fidelity. 

Our results demonstrate the potential of 
quantum systems for the discovery of new 
algorithms and highlight a number of new 
scientific directions. It would be interesting 
to investigate whether instances with large 
Hamming distance between the local and glob- 
al optima of independent set sizes |MIS| - 1 and 
|MIS| can be related to the overlap gap property 
of the solution space, which is associated with 
classical optimization hardness (50). In par-
ticular, our method can be applied to the
optimization of “planted graphs,” designed
to maximize the Hamming distance between
optimal and suboptimal solutions, which can
provably limit the performance of local classi-
cal algorithms (51). Our approach can also be
extended to beyond unit disk graphs by using 
ancillary atoms, hyperfine qubit encoding, and 
a reconfigurable architecture based on coher- 
ent transport of entangled atoms (52). Fur- 
thermore, local qubit addressing during the 
evolution can be used to both extend the range 
of optimization parameters and the types 
of optimization problems (5). Further anal- 
ysis could elucidate the origins of classical and 
quantum hardness, for example, by using graph 
neural network approaches (53). Finally, sim- 
ilar approaches can be used to explore realiza- 
tions of other classes of quantum algorithm 
[see, e.g., (54)], enabling a broader range of 
potential applications.

Predator control of marine communities increases
with temperature across 115 degrees of latitude

Early naturalists suggested that predation intensity increases toward the tropics, affecting fundamental
ecological and evolutionary processes by latitude, but empirical support is still limited. Several studies have
measured consumption rates across latitude at large scales, with variable results. Moreover, how predation
affects prey community composition at such geographic scales remains unknown. Using standardized
experiments that spanned 115° of latitude, at 36 nearshore sites along both coasts of the Americas, we found
that marine predators have both higher consumption rates and consistently stronger impacts on biomass
and species composition of marine invertebrate communities in warmer tropical waters, likely owing to
fish predators. Our results provide robust support for a temperature-dependent gradient in interaction strength
and have potential implications for how marine ecosystems will respond to ocean warming.


The strength of species interactions, such
as predation and competition, is thought
to peak at low tropical latitudes and
decline toward the poles (1). Such geo-
graphic variation in interaction strength
is invoked frequently as both a major cause 
and consequence of the latitudinal diversity 
gradient, one of the most robust patterns of 
life on Earth (2-5). However, studies available 
to date across large spatial scales and multi- 
ple habitats provide conflicting support 
for increased predation intensity in the tropics 
and have been mostly limited to measuring 
rates of prey loss. For example, predation 
intensity (consumption rate) on seeds (6) and 
terrestrial insect mimics (7) was greater in 
the tropics than at higher latitudes. By con- 
trast, attacks on open ocean long-line fishing 
hooks baited with natural prey peaked at mid- 
latitudes instead of the tropics (8), as did 
consumption of squid baits in shallow coastal 
waters (9). 

Currently, it remains largely unknown wheth- 
er global gradients in predation intensity pro- 
duce associated gradients in the magnitude of 
effects on prey communities, especially across 
latitudes. Such a gradient in community-level 
effects is likely to have profound consequences
for patterns of biodiversity (10), ecosystem func-
tion (11, 12), and resilience to global change
(13). Although some studies have found evi- 
dence for stronger effects of predation on 
community composition at tropical versus tem- 
perate sites, primarily in shallow-water marine 
benthic habitats (14-17), these were restricted 
to spatial scales of 20° to 45° latitude and 
usually along single coastlines. Other regional- 
scale studies in similar marine habitats did not 
detect this latitudinal pattern in community 
effects of predators (18, 19). Where latitudinal 
trends in predation intensity and impact have 
been observed at regional spatial scales, a 
number of environmental factors that follow 
a latitudinal gradient have been proposed as 
drivers of this pattern, including time since 
glaciation, lack of freezing winters, day length, 
and temperature (20). Ambient temperature is 
likely important because it strongly influences 
metabolic rates and underpins organism func-
tioning and the ecology of populations, commu-
nities, and ecosystems (21). Although temperature
generally declines with latitude, the relationship 
varies among regions (Fig. 1). Thus, including 
in situ temperature as an independent predic- 
tor could help to explain the mixed results 
from previous studies. Clarifying the relation-
ship between predation intensity, impacts on
prey communities, and temperature could also 
facilitate prediction of community response to 
future ocean warming. 

We tested whether intensity of predation 
and its community-level effects decrease from 
tropical to subpolar latitudes in coastal marine 
ecosystems. Specifically, we assessed the im- 
pact of fish and other large, mobile predators 
on sessile marine invertebrate communities. 
We used standardized and replicated exper- 
iments at 36 nearshore sites across 115° of 
latitude, along both Pacific and Atlantic coasts 
of the Americas (Fig. 1 and table S1). We con- 
ducted three complementary experiments to 
test whether predation intensity and top-down 
control of prey communities vary consistently 
along latitudinal and temperature gradients 
in both hemispheres. We focused on coastal 
subtidal communities of sessile invertebrates 
on hard substrates for multiple reasons. These 
communities are widely distributed through- 
out the world and are especially conducive to 
experiments, responding rapidly to manipula- 
tion and allowing for robust tests of general 
ecological processes (3, 22). There is also evi- 
dence that top-down control is stronger in the 
tropics than in temperate regions for these 
hard-substrate communities at some regional 
scales (14-16, 18, 23). We expanded on this past 
work to test with high replication whether 
results are consistent on an extensive geographic 
scale, across the Americas in two oceans (24). 

Our experiments measured three separate 
components of predation: (i) consumption 
of a standard bait as a measure of predation 
intensity, (ii) effects of sustained predation 
on the development of benthic community 
composition and biomass over 3 months, and 
(iii) the effects of short-term predation on 
already developed benthic communities (table
S2) (24). The three complementary predation
measures were colocated in space and time at 
each site. To compare predator consumption 
rates on a broadly palatable prey for the first 
component, we used dried squid as a stan- 
dardized bait at all sites and recorded bait loss 
after 1 hour as a measure of predation inten- 
sity (25). For the second and third components, 
we allowed natural communities to develop on 
standardized substrates for 3 months (5) and 
manipulated predator access at different time 
points in community assembly, to evaluate the 
effect of predation on composition and biomass 
of sessile invertebrate communities (24). Cages 
were designed and used in both experiments 
to selectively exclude and evaluate effects of 
large (>1 cm) mobile predators, especially 
fishes, which are major consumers of benthic 
invertebrate prey in shallow subtidal habitats 
and can affect their community composition 
(14-18, 23). The second component contrasted
communities developed continuously under
caged versus uncaged control conditions for
12 weeks. For the third component, we allowed
communities to develop for 10 weeks in cages
and then uncaged half of these, comparing ef-
fects of predator exposure on these established
communities after 2 additional weeks. We also
measured temperature at each site throughout
the experiments using dataloggers (24).

Fig. 1. Site location and mean temperatures. Location, latitude, and mean temperature recorded at experimental sites on Atlantic (triangle) and Pacific (circle)
coastlines of the Americas. Color scale indicates gradient in temperature recorded across latitudes during the experiment (dark blue, ~9°C; dark red, ~31°C).

We analyzed the results with mixed effects
models and a model selection approach, with
separate global models estimating the responses
of bait consumption, sessile community bio-
mass, and community composition to varia-
tion in seawater temperature or latitude, ocean
basin, hemisphere, caging treatment, and inter-
actions among all these terms. We explicitly
compared alternate models that included either
latitude or temperature recorded during the
experiment to evaluate which was a better
predictor of predator effects (24).
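
The comparison of alternate predictors can be illustrated with a deliberately simplified sketch. The code below fits two single-predictor ordinary least-squares models (latitude or temperature) and compares them by the Akaike information criterion (AIC). This is a stand-in for the idea only: the study used mixed effects models with additional terms (24), and the site values here are invented for the example, not taken from the data.

```python
import math

def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (slope, intercept, residual sum of squares)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, rss

def aic_gaussian(rss, n_obs, n_params):
    """AIC for a Gaussian linear model, up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params

# Hypothetical sites: absolute latitude (degrees), temperature (deg C),
# and proportion of bait lost. Values are illustrative only.
latitude = [5.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
temperature = [29.0, 30.0, 25.0, 24.0, 15.0, 12.0, 9.0]
bait_loss = [0.91, 0.94, 0.73, 0.67, 0.28, 0.14, 0.01]

slope_lat, _, rss_lat = ols_fit(latitude, bait_loss)
slope_temp, _, rss_temp = ols_fit(temperature, bait_loss)

# Three parameters each: slope, intercept, residual variance.
aic_latitude = aic_gaussian(rss_lat, len(bait_loss), 3)
aic_temperature = aic_gaussian(rss_temp, len(bait_loss), 3)
```

The model with the lower AIC is the better-supported one. In this invented data set the response tracks temperature more closely than latitude, so the temperature model is preferred.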

Fig. 2. Modeled variation in predation intensity and responses of biomass
and community composition to predation with increasing temperature.
(A) Predation measured as bait loss increased with in situ temperature along
Atlantic and Pacific coastlines of the Americas. The line indicates predictions
from a generalized linear mixed effects model [conditional coefficient of
determination ($R^2$) = 0.79]. (B) The effect of predation on biomass accumulation
increased with temperature. Dark blue indicates predators were
excluded throughout the experiment; green indicates predators were excluded
until the last 2 weeks of the experiment and then the experiment was exposed
to predators; and yellow indicates open to predators throughout the
experiment (model conditional $R^2$ = 0.89). Predators consumed significantly
more biomass as temperature increased between 9° and 31°C. (C) Effect of
predation on community composition increased along the latitudinal temperature
gradient. Exclusion of predators throughout the 3-month experiment (gold, caged
versus controls) had a greater impact on community composition than 2-week
exposure (blue, caged versus exposed cage) of the late-stage community to
predators. Lines show effect size as predictions from linear models of square
roots of the estimated component of variation for each contrast within each
site. Shaded areas show 95% confidence intervals (CIs) (24).

Our results provide robust experimental
evidence that top-down control of community
structure consistently increases with temper-
ature and is strongest in the tropics, supporting
a major tenet in ecology and evolutionary biol-
ogy. Predation intensity and its effects on marine
hard-substrate communities increased from
colder high-latitude to warmer tropical waters
(Fig. 2). Seawater temperature and latitude
were strongly correlated [correlation coef-
ficient (r) = 0.84], and although results were
qualitatively similar for seawater temperature 
and absolute latitude, the models with seawater 
temperature were more strongly supported 
for both predation intensity and community 
responses (24). Predation intensity, as mea- 
sured in the first experiment with bait con- 
sumption, was greatest in the warm tropics 
and approached zero at sites where mean sum- 
mer sea surface temperature was below ~20°C 
(Fig. 2A, fig. S2, and table S3). Whereas the 
bait loss assay provides a short-term (1 hour) 
measure of predation intensity, the two caging 
experiments integrate longer-term impacts of 
predators on community attributes, revealing 
that predators had consistently larger effects 
on communities at higher temperatures and 
during multiple stages of community develop- 
ment. Specifically, in the second experiment, 
the effect of predators increased with tem- 
perature for both biomass accumulation (wet- 
weight) (Fig. 2B, fig. S3, and table S4) and 
community composition (Fig. 2C, figs. S4 to S6,
and tables S5 to S7). In the third experiment, 
predators reduced prey community biomass in 
warmer tropical waters during the 2-week ex-
posure, compared with communities that re-
mained caged, and biomass of these exposed
communities converged on uncaged control
treatments across all temperatures (Fig. 2B
and table S4). Community composition also
responded more strongly to this later-stage
predation at warmer sites (Fig. 2C and table
S6). Thus, results of these three complementary
experiments provide strong and consistent
evidence that predation intensity by mobile
predators is higher on average, and shapes
community composition more strongly, in warm
tropical waters.

Ashton et al., Science 376, 1215-1219 (2022)

The organisms that changed most in re- 
sponse to predators were solitary tunicates 
and encrusting bryozoans; dominance of these 
groups diverged among treatments with in- 
creasing temperature (fig. S4). At warm water 
sites, encrusting bryozoans were most preva- 
lent on open control panels, whereas solitary 
tunicates occurred most frequently on caged 
panels that restricted predator access (Fig. 3 
and table S7, C and D). This pattern may re- 
sult from competitive release of less palatable 
bryozoans when spatially dominant tunicates 
are removed by predators during commu- 
nity assembly (19, 26). When later-stage trop- 
ical communities were exposed to predators, 
solitary tunicate dominance was reduced 
(compared with caged panels), with a coinci- 
dent increase in bare space (Fig. 3). Bare space 
decreased toward the tropics in all treatments. 
It is likely that prevalence of large solitary 
tunicates drove the observed higher biomass 
in treatments protected from predators at 
most sites (Fig. 2B).

Although we found a strong overall increase
in predation intensity and top-down control at 
warmer temperatures, the scale of the responses 
varied among ocean basins and hemispheres. 
For example, bait loss and community com- 
position responses were more marked in the 
northern hemisphere (figs. S2, A and B, and
S6B), whereas the biomass response of prey 
communities was more apparent in the North 
Atlantic and South Pacific than other regions 
(fig. S3B). This variation likely derives from 
regional differences in the species and func- 
tional characteristics of predators and prey, envi- 
ronmental conditions other than temperature, 
and/or biological factors beyond those mea- 
sured here (such as productivity) (23). Funda- 
mental differences in oceanography exist at the 
ocean basin scale (for example, equatorial upwell- 
ing on the Pacific coastline is largely absent 
from the Atlantic sites) that would be expected 
to have effects on the observed latitudinal pat- 
terns (27). More broadly, the variation among 
sites underscores the need for high replication 
and broad geographic coverage to thoroughly 
evaluate both regional and global patterns. 

Fig. 3. Effects of predator treatments on community composition at a tropical Atlantic site and
response of key functional groups from models based on all sites. (A to C) Photographs illustrate
differences among experimental treatments at Bocas del Toro, Panama. At this and other warm water sites,
encrusting bryozoans predominated in (A) control panels (exposed to predators), (B) solitary tunicates in
caged panels (predators excluded), and (C) bare space in exposed cage panels [as in (B) but exposed to
predators for the last 2 weeks through cage removal]. (D to F) Modeled percent cover across all sites of (D)
encrusting bryozoans, (E) solitary tunicates, and (F) bare space, which together explained most of the
variation in community composition among treatments (yellow, controls; dark blue, caged; green, exposed
cage) in warm water sites. Shaded areas show 95% CIs (24).

This study provides new insights into the
macroecological pattern of biotic interactions.
We show that intensity of predation indeed
declines consistently with latitude, as expected,
but is better predicted by mean summer tem-
perature experienced during the experiment
than by latitude, hinting at underlying mech-
anisms. We demonstrate that this gradient in
predation intensity produces a parallel gradi-
ent in top-down control of marine community
biomass and composition that has been long
suspected but not rigorously tested at this scale.
As predicted, predation intensity in our shallow 
hard-substrate communities increased with 
temperature, similar to the patterns of bait 
loss in terrestrial and marine environments 
over an expansive latitudinal range (7, 9). Our 
results were likely driven by highly mobile 
fish that can exert strong effects on epibenthic 
invertebrates in warm tropical water (14-18, 23). 
We recognize that predation effects may differ 
for marine communities in other habitat types, 
including those where macroinvertebrates exert 
strong predation effects (3, 27). More specifically, 
other studies in marine systems have shown a 
variety of patterns (8, 9, 28), which may reflect 
physical differences among habitats, taxonomic 
composition of predator or prey groups, smaller 
spatial scales, or less replication. 

Overall, our analyses demonstrate a strong 
temperature-dependent gradient of increasing 
predator impacts on community biomass and 
composition and support prior predictions 
of stronger interaction strengths at warmer
latitudes based on regional-scale studies [for
example, (15, 17)]. This study, completed at a 
large spatial scale, contributes to mounting 
evidence that temperature is a key predictor 
of global gradients, not only in diversity (29) 
and a suite of biological processes (27) but also 
in the strength of interactions among species 
(30, 31) and the resulting effects of those inter- 
actions on communities. 

Our results imply that climate change may 
have predictable effects on the regulation of 
nearshore communities along the world’s shore- 
lines. Our finding of a fundamental relation- 
ship between temperature and predation effects 
across large geographic scales suggests that, in 
addition to shifting species’ distributions (32), 
ocean warming may cause the intensity of 
top-down control to expand poleward (Fig. 4). 
Specifically, the observed temperature-predation 
relationship exhibits an inflection point at ~20°C 
(Fig. 2) (19) that will likely move poleward 
with warming (Fig. 4), both promoting top-
down control at high latitudes and increas-
ing predation effects at mid- to high latitudes
through time (33). The response to warming
is less certain in the tropics, where predation
may increase or decrease, because projected
temperature increases are beyond our cur-
rent range of observations and may exceed
thermal tolerances of existing predators. Such
broad-scale shifts in top-down control could
have far-reaching consequences, given the key
role of species interactions in maintaining eco-
system structure, diversity, biogeochemical pro-
cesses, and the provision of critical ecosystem
services to human communities (3, 13).

Fig. 4. Conceptual illustration of the hypothe-
sized impact of ocean warming on future trends
in top-down control of marine communities.
Predation intensity was low and had little or no
effect on benthic communities at cold latitudes and
increased toward the equator with temperature,
above an inflection point (~20°C). The black line
describes a simplified view of the current latitudinal
pattern of top-down control in our study. The solid
red line describes the hypothesized effect of
future ocean warming, which may shift this inflec-
tion point poleward, increasing predation effects at
higher latitudes. The dashed red line describes a
region of uncertainty in the tropics, where increased
temperatures exceed our current observations
and possibly thermal tolerance of some predators,
so that top-down control may increase or decline
within this region (shaded to suggest a range of
possible responses).
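
The hypothesized poleward shift of the ~20°C inflection point can be made concrete with a toy model. The sketch below assumes a hinge-shaped response (no top-down control below the threshold, linear increase above it) and a linear decline of sea temperature with latitude. Every parameter value (30°C at the equator, 0.4°C per degree of latitude, 2°C of warming, the response slope) is a hypothetical choice for illustration, not an estimate from this study.

```python
def top_down_control(temperature_c, threshold_c=20.0, slope=0.1):
    """Hinge response: zero below the threshold temperature,
    increasing linearly above it (arbitrary units)."""
    return max(0.0, slope * (temperature_c - threshold_c))

def sea_temperature(latitude_deg, warming_c=0.0, equator_c=30.0, lapse=0.4):
    """Hypothetical linear decline of temperature with absolute latitude."""
    return equator_c - lapse * abs(latitude_deg) + warming_c

def threshold_latitude(warming_c=0.0, threshold_c=20.0, equator_c=30.0, lapse=0.4):
    """Latitude at which the temperature equals the threshold."""
    return (equator_c + warming_c - threshold_c) / lapse

current_edge = threshold_latitude(0.0)  # latitude of the inflection point today
warmed_edge = threshold_latitude(2.0)   # after 2 deg C of uniform warming

# A site at 28 degrees latitude sits below the threshold today
# but above it after warming.
control_now = top_down_control(sea_temperature(28.0))
control_warmed = top_down_control(sea_temperature(28.0, warming_c=2.0))
```

Under these assumptions the inflection point moves from 25° to 30° latitude, so sites in between shift from negligible to measurable top-down control. The toy model says nothing about the tropics, where the response to warming is uncertain.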


REFERENCES AND NOTES 


1. D. W. Schemske, G. G. Mittelbach, H. V. Cornell, J. M. Sobel, K. Roy, Annu. Rev. Ecol. Evol. Syst. 40, 245-269 (2009).
2. H. Hillebrand, Am. Nat. 163, 192-211 (2004).
3. R. T. Paine, Am. Nat. 100, 65-75 (1966).
4. J. H. Connell, Annu. Rev. Ecol. Syst. 3, 169-192 (1972).
5. G. J. Vermeij, E. Zipser, E. C. Dudley, Paleobiology 6, 352-364 (1980).
6. A. L. Hargreaves et al., Sci. Adv. 5, eaau4403 (2019).
7. T. Roslin et al., Science 356, 742-744 (2017).
8. M. Roesti et al., Nat. Commun. 11, 1527 (2020).
9. M. A. Whalen et al., Proc. Natl. Acad. Sci. U.S.A. 117, 28160-28166 (2020).
10. A. L. Freestone, B. D. Inouye, Ecology 96, 264-273 (2015).
11. A. L. Freestone, G. M. Ruiz, M. E. Torchin, Ecology 94, 1370-1377 (2013).
12. B. S. Cheng, A. H. Altieri, M. E. Torchin, G. M. Ruiz, Ecology 100, e02617 (2019).

13. J. A. Estes et al., Science 333, 301-306 (2011).
14. A. L. Freestone, R. W. Osman, G. M. Ruiz, M. E. Torchin, Ecology 92, 983-993 (2011).
15. A. L. Freestone et al., J. Anim. Ecol. 89, 323-333 (2020).
16. A. L. Freestone et al., Ecology 102, e03428 (2021).
17. G. M. Dias et al., Divers. Distrib. 26, 1198-1210 (2020).
18. L. P. Kremer, R. M. da Rocha, Biol. Invasions 18, 3223-3237 (2016).
19. J. T. Lavender, K. A. Dafforn, M. J. Bishop, E. L. Johnston, Ecology 98, 2391-2400 (2017).
20. A. Moles, J. Ollerton, Biotropica 48, 141-145 (2016).
21. J. H. Brown, J. F. Gillooly, A. P. Allen, V. M. Savage, G. B. West, Ecology 85, 1771-1789 (2004).
22. J. J. Stachowicz, R. B. Whitlatch, R. W. Osman, Science 286, 1577-1579 (1999).
23. M. E. Torchin et al., Ecology 102, e03434 (2021).
24. Materials and methods are available as supplementary materials.
25. J. E. Duffy, S. L. Ziegler, J. E. Campbell, P. M. Bippus, J. S. Lefcheck, PLOS ONE 10, e0142994 (2015).
26. G. R. Russ, Oecologia 53, 12-19 (1982).
27. K. Véliz, N. Chandia, K. Bischof, M. Thiel, J. Phycol. 56, 1090-1102 (2020).
28. C. A. Musrri et al., Mar. Biol. 166, 142 (2019).
29. D. P. Tittensor et al., Nature 466, 1098-1101 (2010).
30. T. Wernberg et al., Ecol. Lett. 13, 685-694 (2010).
31. R. L. Kordas, C. D. G. Harley, M. I. O'Connor, J. Exp. Mar. Biol. Ecol. 400, 218-226 (2011).
32. E. S. Poloczanska et al., Front. Mar. Sci. 3, 1-21 (2016).
33. V. Kubelka et al., Science 362, 680-683 (2018).
34. G. V. Ashton et al., Data for “Predator control of marine communities increases with temperature across 115 degrees of latitude”. figshare (2022); https://doi.org/10.25573/serc.19469900.


ACKNOWLEDGMENTS 


This paper is a product of the Pan American Experimental Initiative 
in Marine Macroecology (PanAmEx), supported by numerous 
people and grants since its inception. We are particularly indebted 
to field assistants, host marinas and ports, and institutions for their 
help in establishing and maintaining the sites. For their particular 
contributions we thank N. Abrain Sanchez, G. Agurto Rodriguez, 
P. Albuquerque, R. Altamirano, L. P. Alves, H. Galo Andrade, APPM 
Port Administration, M. Araya, A. Arnwine, C. Arriola, K. Bachen, 
M. Badillo-Aleman, J. Barley, H. Bartsch, N. Battini, L. Bent-Hooker, 


Ashton et al., Science 376, 1215-1219 (2022)


A. Beylan, J. Bleuel, E. Bobadilla, X. Boza, D. Branson, J. Bucholz, 
J. Bueno, A. Bungay, J. P. Carvalho Lima, Y. N. Casanova Salazar, 
M. C. Castellanos, R. Castillo, K. Castro, C. Cesar, I. Chacon, Club
Nautico AFASyN, Club Nautico Mar del Plata, F. Contrera, 

M. Correal, L. Simioni Costa, C. Cruz-Gomez, K. Curiston, I. Davidson,
Y. Davila, M. De Koster (Maritime Gloucester), D. de Miranda Lins, 
L. de Moura Oliveira, M. Lucila Del Gener, C. Détrée, A. Dias Kassuga, 
S. Diaz, R. DiMaria, R. Dueñas, K. Escalante-Herrera, ESPOL
Polytechnic University, L. Falsone, J. Fedex, D. Fernandez Barboza, 
M. Ferreira Valenga, B. Figueroa Soler, O. Florentin, 

L. Freyre, A. Fudge, Galapagos Biosecurity Agency, C. Giachetti, 

A. Giamportone, J. Gomez-Vasquez, J. Gonzalez, M. G. Vazquez, 

P. Guadarrama, E. Guerra, C. Guerra, J. Hardee, M. Harrigan Garfias 
(API, Huatulco), S. Havard, Y. A. Hernandez, M. Hessing-Lewis, 

N. Hitchcock, T. Huber, I. Clube de Natal, K. Inagaki, Instituto
Nacional de Pesca, Inversiones Marina Turistica S.A., V. Jenkins, 
M. Kronman (Santa Barbara Harbor), M. Lacerda (Porto Real 
Marina), H. Lambert, K. Larson, A. Leduc, G. Lima, B. Lonzetti, 

F. Lopes Penha, E. Macaya, S. Martinez, K. Matheson, 
J. Medeiros de Oliveira, T. Mendoza, R. Menezes, H. F. Messano, 
M. Mews, M. Minton, K. Mitchell, J. Monteiro, A. Morales (API, México), 
T. Mullady, T. Murphy, K. Newcomer, S. Obenat, E. Oliveira, N. Ortiz, 
M. Oxxean, E. V. Paiva Bandeira, T. Palyo, L. Pardo, S. Pegau, 

N. Bonnemasou Peixoto, T. Pereira Menezes, R. Periera Menezes, 
J. Pinochet, C. Prentice, A. Ramos-Morales, M. Ramos-Sanchez, 
C. Reveles, V. Reyes, A. Reynolds, M. Saldaña, Salinas Yacht Club,
J. P. Sanchez-Ovando, S. A. Santa-Cruz, K. Savage, F. Silva, 

. Silva-Morales, C. Simkanin, G. Smith, S. Soria, C. Soto-Balderas, 
A. Sousa Matos, D. Sparks, A. Tissot, M. Torres, K. Treiberg 
(Santa Barbara Harbor), D. Ugalde, M. Ureña, L. Vallejos,

D. Van Maanen, R. van Velzen, J. Vega-Sequeda, M. Vegh, 

R. B. Vera, J. J. Vera Duarte, M. Vergotti, M. Vieira da Silva, 

A. Villeneuve, W. Wied, T. Wells, R. Whippo, N. Williams, and Yacht
Club de Ilhabela. We also thank the reviewers for their contribution
to improving this manuscript for publication. Funding: This
research has been supported by multiple grants and institutions, 
including Smithsonian Institution Hunterdon and Johnson Fund
to G.M.R. and J.E.D.; NSF OCE grant 1434528 to A.L.F., G.M.R., and
M.E.T.; financial support from FAPESP 2016/17647-5 and CNPq 
308268/2019-9 to G.M.D.; Galapagos Conservancy, Lindblad 
Expedition/National Geographic Fund, Galapagos Conservation
Trust, Paul M. Angell Foundation and Ecoventura, and the Charles
Darwin Foundation to I.K.; Fisheries and Oceans Canada, Aquatic 


10 June 2022 


Invasive Species Science Program to C.H.M.; ANID-FONDECYT 

(# 1180647) and CeBiB (FB-0001) to A.H.B.; CONICYT-FONDECYT 
#1190529 and FONDAP #15150003 (IDEAL) to N.V.; ANID 
(ICN2019_015 and NCN19_05) to S.A.N.; Chilean Millennium 
Scientific Initiative (ESMOI), Chile, to M.M. and M.T.; ANID- 
FONDECYT #1190954 to M.T.; the Tula Foundation to M.W. and 
N.B.; CONICET-PIP 20130100508, ANPCyT-PICT 2016-1083 to 
E.S. and A.B.; CONICET, PIP 2018-2020. 11220170100643CO to 
M.A.; CAPES (Finance Code 001) to E.A.V.; FAPESP 2018/11044-2 
to A.A.V.F.; CNPq 301601/2016-6 to A.A.V.F.; CNPq 309295/ 
2018-1 to R.M.R.; Serrapilheira Institute (Serra-1708-15364) and 
CNPq (310517/2019-2) to G.O.L.; CONACyT 2018-000012- 
OINACF-08376 to L.A.P.-A.; CONACyT 2019-000002-O01NACF- 
14266 to N.Y.S.-M.; and the Smithsonian Institution through the 
Tennenbaum Marine Observatory Network. This publication is 
contribution number 2335 of the Charles Darwin Foundation for 
the Galapagos Islands and contribution number 103 of the 
Tennenbaum Marine Observatories Network and MarineGEO 
Program. Author contributions: G.V.A., A.L.F., J.E.D., G.M.R., 
and M.E.T. conceived the project and design; G.V.A. and B.T. 
coordinated data collection; G.V.A. and B.J.S. analyzed the data; 
G.V.A., J.E.D., A.L.F., G.M.R., B.J.S., and M.E.T. wrote the
manuscript drafts; all authors collected and/or helped collect 
field data. All authors commented on and/or approved the 
manuscript. Competing interests: The authors declare no 
competing interests. Data and materials availability: Data from 
this work can be found at (34). License information: Copyright © 
2022 the authors, some rights reserved; exclusive licensee 
American Association for the Advancement of Science. No claim to 
original US government works. https://www.science.org/about/ 
science-licenses-journal-article-reuse 


SUPPLEMENTARY MATERIALS 


science.org/doi/10.1126/science.abc4916 
Materials and Methods 

Figs. S1 to S6

Tables S1 to S7 

References (35-50) 

MDAR Reproducibility Checklist 


Submitted 25 November 2020; resubmitted 16 September 2021 
Accepted 3 May 2022 
10.1126/science.abc4916 




RESEARCH 


MICROBIOME 


Robust variation in infant gut microbiome assembly 
across a spectrum of lifestyles 


Matthew R. Olm, Dylan Dahan, Matthew M. Carter, Bryan D. Merrill, Feiqiao B. Yu, Sunit Jain,
Xiandong Meng, Surya Tripathi, Hannah Wastyk, Norma Neff, Susan Holmes,
Erica D. Sonnenburg, Aashish R. Jha, Justin L. Sonnenburg


Infant microbiome assembly has been intensely studied in infants from industrialized nations, but little is 
known about this process in nonindustrialized populations. We deeply sequenced infant stool samples 
from the Hadza hunter-gatherers of Tanzania and analyzed them in a global meta-analysis. Infant 
microbiomes develop along lifestyle-associated trajectories, with more than 20% of genomes detected 
in the Hadza infant gut representing novel species. Industrialized infants—even those who are breastfed— 
have microbiomes characterized by a paucity of Bifidobacterium infantis and gene cassettes involved 

in human milk utilization. Strains within lifestyle-associated taxonomic groups are shared between mother- 
infant dyads, consistent with early life inheritance of lifestyle-shaped microbiomes. The population- 
specific differences in infant microbiome composition and function underscore the importance of 
studying microbiomes from people outside of wealthy, industrialized nations. 


The human gut microbiome undergoes a
complex process of assembly beginning
immediately after birth (1). New microbes
attempting to engraft within this com- 
munity often depend upon niches estab- 
lished by previous colonizing species and thus 
the final adult microbiome composition may 
be contingent upon the species acquired early 
in life. The microbiome assembly process of 
infants living in industrialized nations is well 
studied and tends to follow a series of char- 
acterized steps that lead to the low-diversity 
gut microbiome composition characteristic 
of industrialized adults (2). The microbiome 
assembly process that occurs in infants living 
nonindustrialized lifestyles (which results in 
the characteristically diverse adult microbiomes 
of nonindustrialized adults) (3) is largely un- 
known (4). Of particular interest are the fol- 
lowing: the timing at which the microbiomes 
of infants from different lifestyles diverge, the 
microbes and functions that are characteristic 
of infants from different lifestyles, and whether 
there are differences in the taxa that are ver- 
tically transmitted from mothers to infants, 
which seed the microbiome assembly process. 
To address these questions we performed 
metagenomic sequencing on infant fecal samples 
from the Hadza, a group of modern hunter- 
gatherers in sub-Saharan Africa (5, 6). The Hadza 
inhabit seminomadic bush camps of ~5 to 30 
people, exhibit a moderate level of commu- 
nity child rearing within these camps (7), and 
are breastfed early in life and weaned onto a
diet of baobab powder and premasticated 
meat at ~2 to 3 years of age (8, 9). In this study 
we (i) curated and analyzed a global dataset of 
1900 16S rRNA sequencing samples of healthy 
infant fecal samples from 18 populations (in- 
cluding 62 Hadza infant samples) (2, 3, 5, 10-14) 
to contextualize the Hadza infant microbiome 
(figs. S1 and S2), and (ii) performed deep meta-
genomic sequencing on 39 Hadza infant fecal 
samples and corresponding maternal fecal sam- 
ples for 23 infants in order to assess subspecies 
variation, functional potential, and patterns of 
vertical transmission (tables S1 and S2).

A UniFrac ordination created from all 1900 
16S rRNA sequencing samples revealed age 
and lifestyle to be strongly associated with the 
first and second axes of variation, respectively 
(Fig. 1A) (EnvFit; n = 1900; R² = 0.43 and 0.50;
P = 0.001 and 0.001). Comparing populations 
that practice different lifestyles within the 
same country demonstrates that shared life- 
style affects microbiota composition more than 
geographic proximity (Fig. 1A, right panel, and 
fig. S3). The microbiome of infants living in- 
dustrialized lifestyles diverges from others 
within the first 6 months of life, whereas the 
microbiomes of infants living transitional 
versus nonindustrialized lifestyles diverge at 
~30 months of life (Fig. 1B). DNA extraction 
methods, differences in feeding practices, or 
other study-specific aspects may contribute 
to some of the variation in data. Intermedi- 
ate trajectories are exhibited by populations 
on the boundaries of industrialized or non- 
industrialized lifestyles (Fig. 1B, dashed lines), 
highlighting the apparent sensitivity of infant 
microbiota development to lifestyle-related 
factors. 

We identified five microbial coabundance 
groups (CAGs) (15, 16) in our dataset, which 
together account for an average of 93.8% of the 


microbiota composition per sample (Fig. 1C 
and fig. S4). The Bifidobacterium-Streptococcus 
CAG dominates infants from all lifestyles in 
early life (0 to 6 months), and over time this 
CAG yields to the Bacteroides-Ruminococcus
gnavus CAG in industrialized infants and the 
Prevotella-Faecalibacterium CAG in infants 
living transitional or nonindustrialized life- 
styles (Fig. 1C). Lifestyle-related differences 
in dominant CAGs become more pronounced 
over time and mirror taxonomic trade-offs ob- 
served in late infancy (17) and adulthood (5). 

We next used our deep metagenomic se- 
quencing data to assess microbiome-encoded 
functional differences between lifestyles. Broad 
lifestyle- and age-associated differences are seen
in the overall functional capacity of the infant 
microbiomes (Fig. 2A), consistent with 16S rRNA 
amplicon-based analysis (Fig. 1A). Hadza infant 
metagenomes were assembled and binned into 
metagenome-assembled genomes (MAGs) repre- 
senting 745 species, 175 (23.4%) of which rep- 
resent novel species compared to the Unified 
Human Gastrointestinal Genome (UHGG) col- 
lection (18) (table S3). Novel species were re-
covered from diverse phylogenetic groups (fig. 
S5A); 88.6% (n = 155) were recovered from
multiple Hadza samples (fig. S5B) and their 
genome quality was observed to be similar to 
that of genomes in the UHGG (fig. S5C). To 
assess prevalence through read mapping, 
MAGs were integrated with genomes recov- 
ered from Hadza adults (19) and public 
genomes from the human gut (18) into a
comprehensive database of 5755 species- 
representative genomes. Overall, 23.4% of 
microbial species detected in the Hadza infants 
represent novel species (table S4). These data 
support that—similar to the adult Hadza gut— 
the Hadza infant gut contains extensive pre- 
viously uncharacterized diversity. 

The taxonomic specificity afforded by meta- 
genomic sequencing allowed us to identify 
particular species that are depleted or en- 
riched in infants living industrialized versus 
nonindustrialized lifestyles. Identified among 
the infants in this analysis were 310 VANISH 
(Volatile and/or Negatively associated in In- 
dustrialized Societies of Humans) and 12 
BloSSUM (Bloom or Selected in Societies of 
Urbanization/Modernization) species (table S5 
and fig. S6). Comparison against a large database 
of microbial species from nonhuman habitats 
(20) revealed that no VANISH and only one 
BloSSUM species match genomes recovered 
outside of the digestive tract or industrial waste- 
water, whereas 21 VANISH and three BloSSUM 
species match microbes recovered from non- 
human animal feces (table S6). VANISH spe- 
cies are more numerous and abundant than 
BloSSUM (fig. S7), and 63 VANISH species are 
effectively extinct (never detected) in infants 
living industrialized or transitional lifestyles. 
Many VANISH species (45.2%; 140 of 310) are 
present at 0 to 6 months in nonindustrialized
infants whereas BloSSUM species are rarely 
detected this early in industrialized lifestyle 
infants (16.7%; 2 of 12) (Fig. 2B). Together these 
patterns suggest that more species are lost 
than gained as lifestyles industrialize. 
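
The early-life percentages quoted above follow directly from the species counts given in the text. A minimal arithmetic check, using only those counts (the helper name `percent` is ours, not from the study):

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts taken from the text above.
vanish_total, vanish_early = 310, 140    # VANISH species present at 0 to 6 months
blossum_total, blossum_early = 12, 2     # BloSSUM species present at 0 to 6 months

vanish_pct = percent(vanish_early, vanish_total)
blossum_pct = percent(blossum_early, blossum_total)
print(f"VANISH detected early: {vanish_pct}%")
print(f"BloSSUM detected early: {blossum_pct}%")
```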

Amplicon and metagenomic data both show 
that Bifidobacterium is the most prevalent 
taxon in early life (Figs. 1C and 2B). In the first 
6 months, infants living nonindustrial lifestyles 
are dominated by Bifidobacterium infantis 
(also known as Bifidobacterium longum subsp. 
infantis) (Fig. 2C), a prolific utilizer of human 
milk oligosaccharides (HMOs) that is positively 
associated with human health and commonly 
used in probiotic supplements (21). B. infantis
is significantly depleted in industrial micro- 
biomes at 0 to 6 months (P = 0.04; n = 151 
industrialized infants; n = 27 nonindustrial 
infants; Wilcoxon rank-sum test) and found 
at intermediate levels in transitional infants 
(Fig. 2C). Bifidobacterium breve, a species 
capable of limited HMO degradation (22), is 
instead the most abundant Bifidobacterium 
species in industrialized infants (Fig. 2C). 
B. infantis is antiassociated with B. breve in 
infants across all lifestyles (Fig. 2D). This trend 
also holds specifically among industrialized 
infants (correlation = -0.41, P = 1.0 x 10°, n= 
62 industrialized infants, Spearman two-sided
hypothesis test), suggesting it may be driven 
by competitive exclusion rather than lifestyle- 
specific factors. 
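
For readers unfamiliar with the statistic, the antiassociation reported above is a Spearman rank correlation: a Pearson correlation computed on ranks rather than raw abundances. The sketch below is illustrative only. The abundance values are hypothetical and the function names are ours; it shows the calculation of the coefficient, not the study's pipeline or its P value.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical relative abundances for six infants (not data from the study).
b_infantis = [0.60, 0.45, 0.05, 0.00, 0.30, 0.02]
b_breve = [0.05, 0.10, 0.50, 0.70, 0.20, 0.40]
rho = spearman_rho(b_infantis, b_breve)
print(f"Spearman rho = {rho:.3f}")  # negative when one species is high where the other is low
```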

To determine whether these species-level dif- 
ferences result in community-wide differences 
in HMO degradation capacity, we mapped
our metagenomic reads to the most well- 
characterized genetic clusters for human milk 
utilization (table S7). Five of these clusters are 
involved in HMO degradation (H1 to H5) and 
one is involved in nitrogen scavenging (referred 
to as the “urease” cluster) (21, 23); recent studies 
have linked their expression in the infant gut 
microbiome to systemic immunological health 
outcomes (24). Five of the six clusters are more 
prevalent in nonindustrialized than industri- 
alized infants, and their prevalence among 
transitional infants occurs between these two 
extremes (Fig. 2E). The H5 cluster, however, 
exhibits continued persistence beyond the first 
year of life only in infants from industrialized 
lifestyles (Fig. 2E). The H5 cluster encodes an 
ABC-type transporter known to bind core HMO 
structures, and it is more commonly found 
in B. breve than B. infantis (present in 119 of 
129 B. breve MAGs and 41 of 69 B. infantis 
MAGs recovered from industrialized infants; 
P = 14 x 107°, Fisher’s exact test). The per- 
sistence of the H5 cluster beyond 12 months 
in industrialized infants—a time period in 
which breastfeeding is less common in these 
populations—suggests this cassette of genes 
exists in genomes that are not reliant upon 
breastfeeding. Breast milk consumption among 
industrialized infants reduces—but does not 
eliminate—lifestyle-associated differences in 
B. infantis and HMO-degradation cassette 
prevalence (fig. S8). 
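
The H5 comparison above is a 2 x 2 contingency test: cluster present or absent, in B. breve versus B. infantis MAGs. As a hedged sketch, the code below implements a two-sided Fisher's exact test from the hypergeometric distribution and applies it to the counts quoted in the text (119 of 129 and 41 of 69). The function name is ours, and the two-sided convention used here (summing all tables no more probable than the observed one) is an assumption; the authors' software may differ in detail.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)

# Counts from the text: H5 present in 119 of 129 B. breve MAGs
# and in 41 of 69 B. infantis MAGs.
p_value = fisher_exact_two_sided(119, 129 - 119, 41, 69 - 41)
odds_ratio = (119 * 28) / (10 * 41)
print(f"odds ratio = {odds_ratio:.2f}, P = {p_value:.2e}")
```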

We next investigated strain-level differences 
among B. infantis genomes recovered from 


Fig. 1. Age and lifestyle are associated with
infant microbiome composition. (A) Unweighted
UniFrac dissimilarity Principal Coordinates Analysis
(PCoA) (top left panel) of 1900 fecal samples
from infants (<3 years old) across 18 populations
based on amplicon sequence variant abundance.
Point color indicates lifestyle and point size is
proportional to age in months. Boxplots show the
distribution of indicated age groups along PCo1
(bottom) and cohorts along PCo2 (right). (B) PCo2
versus sample age for the three lifestyle categories
(solid lines) and specific indicated subpopulations
(dashed lines). The purple dashed line includes
Russia (Karelia) and South Africa [RU (Karelia) + SA]
and the green dashed line includes Malawi, Nigeria
(Urban), and Bangladesh (MWI + NG + BD). The
middle transitional line (blue) contains all transi-
tional samples. Lines are the smoothed conditional
mean of PCo2 loadings (loess fit). (C) Relative
abundance of CAGs by age group and lifestyle. Taxa
in annotation are the most abundant taxa in a CAG.


infants aged 0 to 1 year old (n = 96 MAGs).
Several lifestyle-associated functional differ- 
ences were discovered including (i) enrich- 
ment of glycoside hydrolase family 163 (GH163), 
a CAZyme involved in the utilization of com- 
plex N-glycans (including those found on 
immunoglobulins), in nonindustrialized ver- 
sus industrialized infants (25) (fig. S9, A and 
B), (ii) differential prevalence of three Pfams 
(including one related to flagellar assem- 
bly) (fig. S9C), and (iii) increased preva- 
lence of four uncharacterized gene clusters 
in MAGs from nonindustrialized versus in- 
dustrialized infants (fig. S9D). To verify these 
metagenomics-based findings, we isolated 
and sequenced 20 B. infantis strains from the 
same Hadza infant fecal samples (table S3). 
GH163 and all four gene clusters also showed 
enrichment among Hadza B. infantis iso- 
lates as compared to the public reference 
genomes (fig. S9). Finally, strong lifestyle- 
specific phylogenetic clustering was observed 
among B. infantis isolate sequences and 
MAGs (Fig. 2F). This observation of strong 
region-specific phylogenetic signals could 
reflect long-term, multigenerational vertical 
transmission (26). 

To assess the extent of vertical strain trans- 
mission in the Hadza infants, we deeply se- 
quenced fecal samples from corresponding 
Hadza mothers (n = 23 Hadza dyads). Detailed
strain-tracking analysis was performed with 
inStrain (27) with a threshold for identical 
strains of 99.999% popANI (table S8). Dyad 
pairs share far more strains (6.4 versus 0.3) 
and have a higher percentage of strains shared
(12.4% versus 0.5%) than nondyad pairs on 
average (P < 0.01, Wilcoxon rank-sum test) 
(Fig. 3A). Further, Hadza nondyads living in 
the same bush camp share more strains than 
those living in different bush camps (Fig. 3A) 
(P < 0.01, Wilcoxon rank-sum test), consistent 
with previously reported increased rates of 
strain sharing within Fijian social networks 
(28). Vertical strain sharing was detected among 
a range of phyla in the Hadza (Fig. 3B) and was 
higher among Bacteroidota and Cyanobacteria 
and lower among Firmicutes (Fisher's exact test 
with false discovery rate correction). Industrial- 
ized infants also exhibited increased and de- 
creased vertical strain sharing of Bacteroidetes 
and Firmicutes, respectively (29). These results 
suggest that community interaction during 
rearing of infants and/or bush camp micro- 
environments may propagate group micro- 
bial sharing (30). 
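
The strain-sharing counts above rest on a simple decision rule: two samples are said to share a strain of a species when their population-level average nucleotide identity (popANI) for that species meets the 99.999% threshold. The sketch below illustrates only that thresholding step. The input dictionary, species values, and function names are hypothetical, and it does not call or reproduce inStrain itself.

```python
POPANI_THRESHOLD = 0.99999  # 99.999% popANI, the threshold stated in the text

def shared_strains(pop_ani_by_species, threshold=POPANI_THRESHOLD):
    """Return species whose popANI between two samples meets the threshold."""
    return sorted(species for species, ani in pop_ani_by_species.items()
                  if ani >= threshold)

def percent_infant_strains_shared(pop_ani_by_species, n_infant_strains):
    """Percentage of the infant's detected strains that are shared with the mother."""
    return 100 * len(shared_strains(pop_ani_by_species)) / n_infant_strains

# Hypothetical mother-infant comparison (values are illustrative, not study data).
dyad_comparison = {
    "Prevotella copri": 0.999995,
    "Bifidobacterium infantis": 1.0,
    "Faecalibacterium prausnitzii": 0.9991,  # same species, different strain
}
shared = shared_strains(dyad_comparison)
pct = percent_infant_strains_shared(dyad_comparison, n_infant_strains=20)
print(shared, f"{pct:.1f}% of infant strains shared")
```

Averaging such per-pair counts over all dyad and nondyad pairs would give summary values of the kind reported above (6.4 versus 0.3 strains).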

The same detailed strain-tracking analysis 
was next performed on a comparative dataset 
of 100 dyads from Sweden (31). Swedish and 
Hadza infants were 1.01 ± 0.00 and 0.95 ±
0.21 years old, respectively (P = 0.04, Wilcoxon
rank-sum test); in addition, Swedish mothers 
were sampled immediately after birth whereas 
Hadza mothers were sampled contemporane- 
ously with infants. Swedish infants born via 
C-section were excluded from this analysis 
(n = 17 eliminated) and in silico rarefaction 
was performed to account for differences in 
sequencing depth between the studies. Just 
as Prevotella and Bacteroides are enriched in 
nonindustrialized and industrialized infants, 
respectively (Fig. 1C), Prevotella and Bacteroides 
strains are more commonly vertically shared 
in Hadza and Swedish dyads, respectively (Fig. 
3C; Fisher's exact test; P < 0.01). Similar trends 
are observed for VANISH and BloSSUM taxa 
(Fig. 3C). The species more abundant in ma- 
ternal samples were more likely to be ver- 
tically transmitted (fig. S10); however, the small 
difference in infant age between populations 
may contribute to some differences. The find- 
ings suggest that vertical transmission may be 
a mechanism by which microbiota change is 
propagated over generations in response to 
altered lifestyles (32-34). 


Fig. 2. Age and lifestyle are associated with infant 
microbiome functions. (A) PCoA on the basis of 

on 682 infant fecal metagenomes described at the 
gene abundance level in reads per kilobase million 
(RPKM). Points are colored by lifestyle and point size 
indicates infant age in months. Boxplots (bottom) 
show the distribution of indicated age groups in 
months along PCo1. Boxplots (right) show the
distribution of each lifestyle along PCo2. The main
axis of variation in this gene-based ordination is
significantly associated with age (EnvFit; R² = 0.30;
n = 679; P = 0.001) and the second axis of variation is
significantly associated with lifestyle (EnvFit; R² = 0.35;
n = 679; P = 0.001). (B) Prevalence of species across 
lifestyles among infants 0 to 6 months old. VANISH 
(red and green) and BloSSUM (blue) species with the 
lowest adj-P values have text annotations. B. infantis 

is shown in orange. “Other” taxa (gray) are those that do 
not significantly differ according to lifestyle. (C) Relative 
representation of four common Bifidobacterium 

species in infants 0 to 6 months old by lifestyle. 

(D) Scatterplot of B. infantis versus B. breve abundance 
among infants 0 to 6 months old. Contour lines display 
the kernel density estimation. (E) Prevalence of HMO- 
utilization clusters across ages and lifestyles. Clusters 
are considered present if all genes in the cluster 

are detected above a variable coverage threshold 

(to ensure that results are robust to differences in 
sequencing depth; see methods for details). * indicates 
adj-P < 0.05; Fisher's exact test with false discovery 
rate correction; nonindustrialized versus industrialized. 
(F) Phylogenetic tree of B. infantis genomes based on 
universal single copy genes. Genome names are colored 
on the basis of lifestyle of origin. Isolate genomes are 
marked with a checkmark. Public reference genomes for 
B. longum and B. infantis are included (gray text). 


Taken together, our data show that infants 
from all lifestyles begin life with similar 
Bifidobacteria-dominated gut microbiota com- 
positions, but subtle differences detected in 
early life compound over time. Differences in 
the species composition and HMO-degradation 
genes of the initially dominant Bifidobacterium 
communities are especially relevant as recent 
studies of these same genes suggest that their 
depletion in industrialized infants could have 
long-term negative immune consequences (24). 
The same taxa that differentiate lifestyles at 0 to 
6 months of life are those that are most com- 
monly vertically transmitted, suggesting that 
vertical transmission may help establish alter- 
native development trajectories. Crucially, infants 
living transitional lifestyles display interme- 
diate phenotypes between those of industrial- 
ized and nonindustrialized infants in almost 
all analyses performed. Although not conclu- 
sive, this is an important piece of evidence 
pointing to lifestyle as a possible causative fac- 
tor in infant microbiome assembly. The Hadza- 
specific discoveries reported in this work 


(including the finding of increased nondyad 
Fig. 3. Strain sharing between mother-infant dyads and nondyads is lifestyle-specific. (A) The mean
strains shared (left) and the percentage of infant strains found in mothers (right) in mother-infant
dyads versus mother-infant nondyads (top) and nondyads from the same bush camp versus nondyads
from different bush camps (bottom). Error bars represent standard error (*, adj-P < 0.05; ** adj-P < 0.01;
*** adj-P < 0.001; Wilcoxon rank-sum test). (B) Percentage of strains detected in all Hadza mothers
and infants and whether they are detected in infants only, mothers only, or shared within a mother-infant
dyad (“shared”) categorized by phylum. Numbers to the right of bars indicate the number of vertically shared
strains over the number of strains detected in either infant or maternal samples. Phyla with a significant
difference in the percentage of vertically transmitted strains as compared with all other phyla are marked
with asterisks (Fisher's exact test with P value correction). (C) Percentage of vertically transmitted strains
in Hadza and Swedish cohorts by phylum (top), genus (middle; only genera with significant differences
shown), and VANISH / BloSSUM (bottom). All metagenomes were subset to 4 Gbp for this analysis to
remove any biases associated with sequencing depth. Taxa that are significantly enriched in either cohort are
marked with an asterisk (* adj-P < 0.05; ** adj-P < 0.01; *** adj-P < 0.001; Fisher's exact test).


vertical transmission among members of the 
same bush camp, a social structure with no 
equivalent among industrialized commu- 
nities) exemplify the importance of studying 
people outside of industrialized nations and 
highlight the need for additional studies to
provide equity in understanding microbiomes 
across global societies. Our results also high- 
light the question of whether lifestyle-specific 
differences in the gut microbiome’s develop- 
mental trajectory predispose populations to 
diseases common in the industrialized world, 
such as those driven by chronic inflammation 
(35, 36). 
Doubly stereoconvergent crystallization enabled by


asymmetric catalysis 


Pedro de Jesús Cruz, William R. Cassels, Chun-Hsing Chen, Jeffrey S. Johnson*


Synthetic methods that enable simultaneous control over multiple stereogenic centers are desirable 
for the efficient preparation of pharmaceutical compounds. Herein, we report the discovery and 
development of a catalyst-mediated asymmetric Michael addition/crystallization-induced diastereomer
transformation of broad scope. The sequence controls three stereogenic centers, two of which are 
stereochemically labile. The configurational instability of 1,3-dicarbonyls and nitroalkanes, typically 
considered a liability in stereoselective synthesis, is productively leveraged by merging enantioselective 
Brønsted base organocatalysis and thermodynamic stereocontrol using a single convergent
crystallization. The synthesis of useful γ-nitro β-keto amides containing three contiguous stereogenic
centers is thus achieved from Michael acceptors containing two prochiral centers. 


The development of robust, stereoselec-
tive synthetic methods that achieve
precise control in the simultaneous con-
struction of multiple stereogenic centers
is crucial to accelerating the discovery
of the next generation of drugs (1-7). The
increase in the stereochemical complexity 
of such compounds has inspired academic 
and industrial chemists to invent modern 
and more efficient methods to facilitate their 
construction. Enantioselective catalysis is a 
powerful and broadly applicable paradigm 
that has been used extensively, and the avail- 
ability of myriad mechanistic manifolds that 
can be used in the construction of C-C and 
C-heteroatom bonds render this blueprint 
especially attractive (8). With notable excep- 
tions (9), the discovery of enantioselective 
catalytic reactions tends to focus on how to 
maximize stereoselectivity (transition state 
focus), whereas issues around postreaction 
processing and optimization of physical prop- 
erties of the products (ground state focus), 
considerations that are critical in fields such 
as polymer chemistry or process chemistry, 
tend to be neglected. An unfortunate by- 
product of this bifurcation in focus is that 
even for relatively efficient reactions, the prac- 
ticing chemist is often faced with laborious 
energy- and resource-intensive purifications 
that limit applicability on larger scales (10). 
This issue is aggravated when valuable matter 
is lost as undesired stereoisomers. 

To circumvent the inherent issues of ordi- 
nary purification techniques (e.g., flash col- 
umn chromatography, high-pressure liquid 
chromatography, etc.), the crystallization or 
precipitation of products from reaction mix- 
tures substantially simplifies isolation while 
decreasing the amount of waste generated, 
time spent, and energy required (11). For re- 
actions that generate stereoisomeric mixtures,
simple recrystallizations cannot overcome 
the fact that valuable material will be lost 
as undesired stereoisomers; however, for cases 
in which equilibration between stereoisomers 
is mechanistically feasible, convergence to a 
single stereoisomer of the product can be 
achieved by engineering crystallization-induced 
diastereomer transformations (CIDTs). CIDTs 
are highly desirable in synthetic chemistry and 
industrial applications because they provide 
a means to generate highly stereoenriched 
products without requiring additional tedi- 
ous purifications (12, 13). CIDT selectivity is
governed by crystallization thermodynamics: 
Diastereomers undergoing CIDT contain one 
or more static asymmetric centers and at least 
(and commonly) one labile element of chiral- 
ity that is subjected to equilibration. Imple- 
mentation of CIDT strategies into synthetic 
routes reduces the effort required to access a 
single stereoisomer of a complex molecule: 
100% theoretical yield of a single stereoiso- 
meric product can be obtained from an initial 
mixture of interconverting epimers. CIDT re- 
actions are usually applied to specific prob- 
lems in industrial chemistry, they are difficult 
to predict, and generalizable non-auxiliary- 
based CIDT methods are currently underdevel- 
oped (12, 13). We were interested in testing the 
notion that by leveraging epimerization to 
achieve stereoconvergence in complex systems, 
we could accrue unique advantages through 
the merged application of asymmetric catalysis 
and crystallization-driven selectivity. 

The base-catalyzed Michael reaction between 
two prochiral reaction partners was judged 
to be an ideal test case to evaluate such a hy- 
pothesis: Asymmetric variants comprise atom 
economical skeletal assemblies in organic 
synthesis and have been shown to proceed 
efficiently with a variety of catalytic platforms 
(14). For the projected application, the obliga- 
tory electron-withdrawing groups in the start- 
ing materials that enable the polar C-C bond 
construction should also engender multiple
acidic C-H sites in the product required to es- 
tablish the requisite complex equilibria; how- 
ever, a parallel consideration highlighted in 
Fig. 1A is that the complexity of the system 
rises geometrically as the number of configu- 
rationally unstable asymmetric centers grows 
d.a S I.b S Ie S P). For this reason, an 
enantioselective reaction in which a dual-role 
catalyst both mediates the installation of a 
keystone stereocenter and induces completely 
convergent crystallization of a product with 
two labile centers is unknown (J5-17). Here, 
we disclose such an advance, in which the 
combination of bifunctional Bronsted base 
asymmetric organocatalysis (18) with CIDT 
principles enables the stereoconvergent syn- 
theses of y-nitro B-keto amides containing three 
contiguous asymmetric centers from two prochi- 
ral reaction partners through CIDT reactions 
with considerable scope and downstream utility. 
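
The geometric growth in complexity noted above can be made concrete with a short enumeration. This is an illustrative sketch only: it assumes that each configurationally labile center can adopt two configurations (labeled R and S here) while the catalyst-set center stays fixed. The function name `labile_isomers` and the arbitrary "S" label for the static center are hypothetical and are not taken from the study.

```python
from itertools import product

def labile_isomers(n_labile, static_config="S"):
    """Enumerate isomers that share one fixed (static) center and
    differ only at n_labile epimerizable centers."""
    return [(static_config, *labile) for labile in product("RS", repeat=n_labile)]

# The count doubles with every additional labile center: 1, 2, 4, 8.
for n in range(4):
    print(n, "labile center(s):", len(labile_isomers(n)), "interconverting isomers")
```

With two labile centers the enumeration gives four species, the same number of diastereomers the authors later report observing in solution before crystallization begins.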

Foundational studies were initiated to assess
the viability of a stereoconvergent crystallization
using nitromethane (1a) as a pronucleophile
and the prochiral Michael acceptor 2
(Fig. 1B). Under homogeneous conditions (see
the supplementary materials for details), the
conjugate addition product was obtained in
high yields, although the stereochemistry of
the β-dicarbonyl stereogenic center was
uncontrolled, as expected [85% yield, 1.1:1
diastereomeric ratio (dr)]. The use of ethereal
solvents [e.g., diethyl ether, methyl tert-butyl
ether (MTBE)] allowed for selective crystallization
of a single diastereomer of the conjugate
addition adduct directly from the reaction
medium (see the supplementary materials
for solvent studies). Reaction concentrations
and temperatures were moreover optimized
to prevent the spontaneous precipitation of
starting material and isomerically impure
product (if the reaction was too concentrated)
or loss of product in the filtrate (if the
reaction was too dilute). Concurrent with
the optimization of the reaction-based
crystallization protocol, we also examined the
features of the Brønsted base catalyst and their
effect on the stereoselectivities and efficiency
of the protocol (see the supplementary materials
for catalyst optimization). We found that
the chiral Dixon iminophosphorane A (18)
successfully engaged in the proposed
stereoconvergent crystallization, giving β-keto amide
3a with excellent yields and enantioselectivity
and diastereoselectivity [yield 96%, enantiomeric
ratio (er) 94:6, dr >20:1] after a single
filtration of the reaction.
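
The selectivity figures quoted above are ratios, and it can help to see them as mole fractions or as an enantiomeric excess. The following is a minimal sketch of that arithmetic using the standard definitions; the helper names `ratio_to_fractions` and `er_to_ee` are hypothetical and chosen only for illustration.

```python
def ratio_to_fractions(major, minor):
    """Convert a major:minor ratio (dr or er) into mole fractions."""
    total = major + minor
    return major / total, minor / total

def er_to_ee(major, minor):
    """Enantiomeric excess (%) from an enantiomeric ratio major:minor."""
    return 100.0 * (major - minor) / (major + minor)

# er 94:6 reported for 3a after a single filtration
print("ee of 3a (%):", er_to_ee(94, 6))
# dr 1.1:1 under homogeneous conditions is close to an equal mixture
print("major fraction at dr 1.1:1:", round(ratio_to_fractions(1.1, 1)[0], 3))
# dr >20:1 means the major diastereomer exceeds 20/21 of the solid
print("lower bound at dr 20:1:", round(ratio_to_fractions(20, 1)[0], 3))
```

On these definitions, an er of 94:6 corresponds to 88% ee, and a dr above 20:1 means that more than about 95% of the isolated solid is a single diastereomer.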

This stereoconvergent crystallization protocol
could be successfully applied to a range
of substituted alkylidenes that were converted
into their corresponding crystalline nitro ketone
adducts in good to excellent yields and
stereoselectivities (Fig. 2). Nitro ketones containing
halogens (3b and 3k), alkyl (3c and
3d), and electron-withdrawing (3e to 3h)
groups at various positions were obtained in
good to excellent yields and stereoselectivities
starting from unsymmetrical aryl alkylidenes.
A product containing synthetically useful
boronic ester 3i was synthesized in moderate
yields and excellent stereoselectivities after
the stereoconvergent crystallization procedure.
p-Dimethylamine- and naphth-2-yl-derived
alkylidenes successfully engaged in
this reaction, delivering nitroalkanes 3j and 3l
in good yields and stereoselectivities. A variety
of heteroaryl products, including furan-3-yl
(3m), thien-2-yl (3n), N-methyl pyrrol-2-yl (3o),
pyrid-2-yl (3p), and Boc-protected indol-3-yl
(3q), were obtained in equally efficient and
enantioselective reactions. Several ketone and
amide substrates were tested in this reaction
(Fig. 3). Aryl ketone products containing halogens
(3r), alkyl (3s), and electron-donating
groups (3t) were obtained in good to excellent
yields and enantioselectivities. Alkyl ketone
substrates were tolerated under the reaction
conditions, although nitroalkane (3u) was
obtained at a low enantiomeric ratio. A variety
of amides were suitable substrates for this
protocol. Piperidine and synthetically useful
Weinreb (19) amide substrates engaged in
fully enantio- and diastereoselective stereoconvergent
crystallization to deliver the corresponding
nitroalkanes (3v and 3w) in good
yields. Last, diisopropyl amide alkylidenes
delivered the corresponding nitroalkanes with
excellent efficiency (3x to 3z). In some instances
(3y and 3z), the diisopropyl amide
outperformed the morpholine amide (3f and
3k), and improvements in the enantio- and
diastereoselectivity were observed (see below).
These preliminary results suggest that the
identity of the acyl group could be tuned to
improve the efficiency of the crystallization
and can be a point of diversification when
optimizing related CIDT reactions.

Extensive x-ray diffraction studies were undertaken
to evaluate product stereochemistry.
As predicted based on a paradigm involving
kinetic (catalyst) control of the static asymmetric
center, x-ray analysis showed that the
β-configuration is rigorously conserved for the
Michael addition products 3a, 3f, 3h, 3i, 3j,
3n, 3t, and 3x. The fluxional nature of the
α-stereocenter could in principle lead to different
results across the series based on solubility
properties, but the outcomes here were
consistent as well: The (S)-configuration
was observed at the β-keto amide methine
in the crystallized products.

Having established a method to control the 
α-stereogenic center through thermodynamically
driven stereoconvergence, attention was then 
directed to expanding this platform to the use 
of prochiral nucleophiles. We began our studies 
with the reaction of p-trifluoromethylphenyl 
alkylidene using nitroethane as the prochiral 
nucleophile. Under identical conditions (see 
the supplementary materials), nitroalkane 
product 4e crystallized from solution and 
was obtained in 95% yield, 95:5 er, and 6:1 dr 
after a single filtration. In situ monitoring 
revealed that the Michael addition was rela- 
tively fast, whereas diastereomerization and 
crystallization were slower. Accordingly, in- 
creasing the catalyst loading to 20 mol % and 
changing the solvent to 2-methyltetrahydrofuran 
improved the overall efficiency of the CIDT 
(see the supplementary materials for addi- 
tional details), presumably due to accelerated 
interconversion of the diastereomeric mixture. 
Under the optimized conditions, the desired 
conjugate addition product was obtained in 
excellent yield and stereoselectivity (Fig. 3). 
Fig. 2. Substrate scope of the crystallization-enabled stereoconvergent
conjugate addition. Reaction conditions were as follows: cat. A (5.0 mol %),
alkylidene (0.200 mmol, 1.0 equiv, 0.5 M), MeNO2 (0.600 mmol, 3.0
equiv), MTBE (0.400 ml), 0°C, 16 hours. Yields refer to isolated yields.
The dr values were determined by 1H NMR spectroscopic analysis of the
solid obtained by filtration of the crude reaction mixture. The er values of
the solids after filtration were determined by high-performance liquid
chromatographic analysis using a chiral stationary phase. *Reaction
performed at 23°C. †Reaction performed in 2-Me-THF as solvent.
‡Reaction performed with 5 equiv MeNO2. §Reaction performed using
MTBE:DCM (5:1) as a solvent to help solubilize alkylidene starting
material. ¶Reaction performed with 15 equiv of MeNO2 to help solubilize
alkylidene starting material. #Reaction performed in Et2O as solvent.
**Crystallization occurred after 96 hours.
Fig. 3. Substrate scope of the crystallization-enabled doubly stereoconvergent
conjugate addition. Reaction conditions were as follows: cat.
A (20 mol %), alkylidene (0.200 mmol, 1.0 equiv, 1 M), EtNO2 (0.600 mmol,
3.0 equiv), 2-Me-THF (0.200 ml), 23°C, 48 hours. Yields refer to isolated
yields. The diastereomeric ratio values were determined by 1H NMR
spectroscopic analysis of the solid obtained by filtration of the crude
reaction mixture. The er values of the solids after filtration were determined
by high-performance liquid chromatographic analysis using a chiral stationary
phase. *Reaction performed in Et2O as solvent. †Reaction performed with
1-nitropropane (0.600 mmol, 3.0 equiv). ‡Reaction performed with MTBE
(0.2 ml, 1.0 M). §Reaction performed with nitroethanol (0.600 mmol, 3.0
equiv). ¶Reaction performed with ethyl nitroacetate (0.600 mmol, 3.0 equiv).
#Reaction performed with (nitromethyl)benzene (0.600 mmol, 3.0 equiv).
**Crystallization observed after 96 hours.




RESEARCH | REPORT 


Fig. 4 panel titles: (A) Cα and Cγ are configurationally unstable: in situ
1H NMR spectroscopic study of D/H exchange. (B) Diastereoselectivity is
driven by crystallization: mole fraction of each diastereomer versus time (h)
under homogeneous and heterogeneous conditions. (C) Crossover experiment:
Michael addition is irreversible; no crossover observed.


Fig. 4. Mechanistic insights into the origin of stereoselectivity. (A) Deuterium/hydrogen exchange 
studies were performed to establish that the catalyst can epimerize both labile stereocenters. (B) An initial 
diastereomeric mixture converges to a major diastereomer through crystallization-driven stereoconvergence. 
(C) Epimerization through retro-Michael addition is mechanistically possible but not operative. 


This result represents a rare example of an 
efficient noncascade Michael addition reac- 
tion between prochiral nucleophilic and elec- 
trophilic reaction partners in which the latter 
is prochiral at both alkene carbon atoms (14).

Like the single CIDT reaction, this protocol
is applicable to a wide range of substrates
(Fig. 3). A variety of aryl alkylidenes delivered
nitroalkanes containing halogens (4b to 4d)
and electron-withdrawing (4e to 4g) functional
groups. Heterocyclic alkylidenes were
tolerated (4h to 4k), including N-methyl
pyrrole (4h), furan-2-yl (4i),
furan-3-yl (4j), and Boc-protected indol-3-yl
(4k) substrates. Other prochiral nitroalkanes were
tolerated in the doubly stereoconvergent
crystallization manifold and delivered β-keto amides
containing other alkyl (4l), alcohol (4m), ester
(4n), and aryl (4o) functional groups in moderate
to good yields and stereoselectivities. A
variety of aryl ketone substrates bearing halogens
(4p), electron-withdrawing substituents
(4q), and alkyl substituents (4r) were also viable
substrates for this transformation. The ethyl ketone
substrate delivered the desired nitroalkane in
excellent yield but low stereoselectivities (4s).
Finally, a variety of amides were suitable substrates
for this transformation, and nitroalkane
adducts containing piperidine amide (4t),
synthetically useful Weinreb amide (4u), and
dicyclohexyl amide (4v) were obtained in good
to excellent yield and stereoselectivities.

As in the nitromethane additions, x-ray
crystallography was indispensable in assessing
the stereochemical outcomes of the high-order
CIDTs. Effective regulation of the static
asymmetric β-center was again enabled by the Dixon
iminophosphorane catalyst. The γ-stereocenter
was conserved in six of the seven analyzed
products, with the (S)-configuration regularly
observed at the nitronate center; the outlier
[γ-(R) configuration] was nitroalkane 4v. A
comparison of amides 4g and 4v revealed
inverted configurations at both labile centers
in the products, triggered only by changing
the amide identity (morpholine amide versus
dicyclohexyl amide). Dichotomous stereochemical
behavior at the α-center was observed in
CIDT reactions giving Michael adducts with
electron-poor (4e to 4g) and electron-neutral
or -rich aromatic groups (4d, 4i, and 4m). The
assignments for those products not yet studied
by x-ray diffraction must be construed as tentative
at this point, but the ability to fully invert
the obtained major diastereomer in certain
cases is exciting, and future work will be directed
at understanding and exploiting the structural
factors that favor isomer-selective crystallization.

Fig. 5. Synthetic utility of the crystallization-enabled doubly
stereoconvergent conjugate addition. See the supplementary materials for
specific reaction details. (A) A 50 g doubly stereoconvergent
crystallization enabled by catalyst recycling. Reaction conditions were as
follows: (i) cat. B (20 mol %), alkylidene 2 (80.3 mmol, 1.0 equiv), EtNO2
(241.0 mmol, 3.0 equiv), 2-Me-THF (80 ml, 1.0 M), 23°C, 48 hours.
(ii) Unpurified homogeneous filtrate containing catalyst and excess
nitroalkane was charged with prochiral alkylidene 2 (80.3 mmol, 1.0 equiv).
(iii) Recrystallization from 80% EtOAc:hexanes. (B) Diastereoselective
synthesis of secondary and tertiary alcohols. (iv to viii) nitroalkane 4i
(0.100 mmol, 1.0 equiv), CeCl3 (0.300 mmol, 3.0 equiv), LiCl (0.600 mmol,
6.0 equiv), RMgX (0.180 mmol, 1.8 equiv), THF/DCM (5:1), -78°C, 12 hours.
(ix) nitroalkane 4i (0.100 mmol, 1.0 equiv), TiCl4 (0.110 mmol, 1.1 equiv),
BH3·DMS (0.500 mmol, 5.0 equiv), DCM, -78°C, 5 hours. (C) Reductive
cyclization and synthetic utility of the amide functional handle.
(x) nitroalkane 4i (1.5 mmol, 1.0 equiv), Zn (45.0 mmol, 30 equiv), AcOH,
65°C, 5 hours. (xi) nitroalkane 4i (0.200 mmol, 1.0 equiv), LiAlH4
(0.800 mmol, 4.0 equiv), THF, 23°C, 2 hours.
Fig. 5B entries: (v) vinylation, 5b (CH=CH2), 86%, >20:1; (vi) allylation,
5c (CH2CH=CH2), 65%, >20:1; (vii) alkynylation, 5d (C≡CMe), 85%, >20:1;
(viii) arylation, 5e (4-C6H4F), 77%, >20:1. Fig. 5C, Reductive Cyclization
and Amide Manipulations: cyclization (x), 5g, yield 91%, dr >20:1;
reduction (xi), 5h, yield 60%.

To gain further insight into the proposed
doubly stereoconvergent crystallization process,
a series of mechanistic experiments was
performed (Fig. 4). To understand the
stereolability of Cα and Cγ in the presence of a chiral
Brønsted base, deuterium/hydrogen exchange
studies using isotopically labeled,
diastereomerically pure (dr >20:1) nitroalkane 4b-d2
were performed. Using CH3OH [20 equivalents
(equiv)] as the protic additive, >95% H
incorporation and stereochemical scrambling
at Cα and Cγ were observed in the presence of
the iminophosphorane A, indicating that both
centers are susceptible to epimerization by the
catalyst under the reaction conditions (Fig. 4A).
We then turned our attention to understanding
the rates of epimerization and the efficiency of
the crystallization (Fig. 4B). A set of identical
parallel reactions was performed simultaneously
and quenched at different time points
to quantitatively study the product distribution
over time using 1H and 19F nuclear magnetic
resonance (NMR) spectroscopic techniques.
The medium was homogeneous early in the
reaction, and no significant preference for
any product diastereomer was observed. A
mixture of four diastereomers was initially
present in solution. The onset of crystallization
(t ~ 3 hours) initiated product diastereomer
enrichment through CIDT at the expense
of the three more soluble, equilibrating
diastereomers. Reaction progress was marked
by steady perturbation of the isomer ratio
away from the unselective solution equilibrium
(t = 1 hour), continuing over the course of
the 48-hour reaction to result in the highly
enriched product.
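
The time course described for Fig. 4B can be illustrated with a deliberately simple numerical model. This is a toy sketch under stated assumptions, not the authors' kinetic analysis: four isomers interconvert in solution with one shared first-order rate constant, the equilibrium mixture is taken to be equimolar, and only isomer 0 crystallizes, instantly and irreversibly, whenever its dissolved amount exceeds a solubility limit. The rate constant, solubility, and function name `simulate_cidt` are hypothetical values chosen for illustration, and the induction period before crystallization (about 3 hours in the study) is not modeled.

```python
def simulate_cidt(total=1.0, n_isomers=4, solubility=0.02,
                  k_epim=1.0, dt=0.01, hours=48.0):
    """Toy CIDT model: isomers equilibrate in solution; isomer 0 crystallizes
    whenever its dissolved amount exceeds its (hypothetical) solubility."""
    solution = [total / n_isomers] * n_isomers   # unselective starting mixture
    solid = 0.0                                   # crystallized isomer 0
    for _ in range(int(round(hours / dt))):
        mean = sum(solution) / n_isomers
        # Epimerization: first-order relaxation toward an equal mixture
        # (explicit Euler step; conserves the total dissolved amount).
        solution = [c + k_epim * (mean - c) * dt for c in solution]
        # Crystallization: move any supersaturation of isomer 0 into the solid.
        excess = solution[0] - solubility
        if excess > 0.0:
            solution[0] -= excess
            solid += excess
    return solution, solid

solution, solid = simulate_cidt()
print(f"solid fraction of isomer 0 after 48 h: {solid:.3f}")
print("dissolved isomers:", [round(c, 3) for c in solution])
```

In this model the solid fraction approaches `total - n_isomers * solubility`, because crystallization keeps draining the one insoluble isomer while epimerization refills it from the other three. That is the qualitative behavior the authors describe: an unselective solution mixture funneled into a single crystalline diastereomer.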

Epimerization through retro-Michael reversion/ 
Michael addition (20) was investigated through 
a crossover experiment. Upon exposure of nitro- 
ethane adduct 4b to the action of iminophos- 
phorane A in the presence of excess nitromethane 
(20 equiv) under homogeneous reaction con- 
ditions for 48 hours, none of the crossover 
product 3b was identified; rather, epimerized 
4b and other unidentified decomposition 
products were observed (Fig. 4C). This ex- 
periment underscored an important corollary: 
Crystallization insulates the product from un- 
desired decomposition pathways that occur 
in the homogeneous environment commonly 
favored for organic reactions. 

With a mechanistic understanding in hand to 
rationalize the excellent stereocontrol observed 
in this reaction platform, an assessment of some 
of the practical features of the doubly stereo- 
convergent crystallization was undertaken 
(Fig. 5). A reaction using 25 g of alkylidene 2 
was performed with excess nitroethane. A 
lower-molecular-weight iminophosphorane (B) 
was used, which efficiently catalyzed the desired 
CIDT reaction to deliver nitroalkane 4i in good 
yield and stereoselectivity after a single filtra- 
tion. When the unpurified homogeneous filtrate 
containing catalyst and residual excess nitro- 
ethane was treated with another 25 g charge 
of prochiral alkylidene 2, nitroalkane 4i was 
obtained with nearly identical efficiency in 
the second cycle. The product was enriched to 
enantiomeric homogeneity with good recovery 
through a single recrystallization (40 g isolated,
dr >20:1, er >99:1). 
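
The reagent quantities for a scaled run follow directly from the equivalents and concentration given in the Fig. 5 caption (80.3 mmol of alkylidene 2, 3.0 equiv of EtNO2, 20 mol % of catalyst B, 1.0 M in 2-Me-THF). The sketch below only reproduces that arithmetic; the helper name `scale_reagents` is hypothetical.

```python
def scale_reagents(limiting_mmol, equivalents, molarity):
    """Return mmol of each reagent and the solvent volume (ml) for a given
    amount of limiting reagent. 1 M equals 1 mmol per ml."""
    amounts = {name: limiting_mmol * eq for name, eq in equivalents.items()}
    solvent_ml = limiting_mmol / molarity
    return amounts, solvent_ml

# Equivalents taken from the Fig. 5 caption; 20 mol % is 0.20 equiv.
equivalents = {"alkylidene 2": 1.0, "EtNO2": 3.0, "catalyst B": 0.20}
amounts, solvent_ml = scale_reagents(80.3, equivalents, molarity=1.0)
for name, mmol in amounts.items():
    print(f"{name}: {mmol:.1f} mmol")
print(f"2-Me-THF: {solvent_ml:.1f} ml")
```

The computed values (240.9 mmol of EtNO2 and 80.3 ml of solvent) agree with the caption's 241.0 mmol and 80 ml to within rounding.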

The isolated products of these Michael CIDT 
reactions can be transformed to value-added 
products in subsequent steps that capitalize 
on the embedded functionality without compromising
the integrity of the extant asymmetric
centers (Fig. 5, B and C) (21). Ketone
additions using organocerium reagents (22, 23)
delivered a variety of tertiary alcohols containing
synthetically useful functional handles, including
the alkyl (5a), vinyl (5b), allyl (5c), alkynyl
(5d), and aryl (5e) groups, in good to excellent
diastereoselectivity. Secondary β-hydroxy amide
5f was obtained in 95% yield as a single
diastereomer through titanium(IV)-mediated
diastereoselective reduction (24). Pyrrolidine
5g was synthesized in a single step in excel- 
lent diastereoselectivity through Zn-mediated 
reductive cyclization (25), and from that point 
the amide could be converted to its derived 
tertiary amine 5h in moderate yield. 

This work establishes a foundation for 
crystallization-induced diastereomer trans- 
formations operating on two configuration- 
ally labile asymmetric centers, enabled in this 
instance by the Dixon chiral iminophosphorane 
Brønsted superbase. The results of the present
study suggest that expanded opportunities
may exist for the productive merger
of divergent, partially selective first-stage 
asymmetric catalysis with crystallization- 
driven second-stage stereoconvergence. A 
key to the generalization and future growth 
of such platforms that capitalize on their 
myriad benefits will be the development of
robust predictive tools based on, among other 
things, analysis of crystal packing and ma- 
chine learning. 

Enantioselective hydrogen-bond-donor catalysis to
access diverse stereogenic-at-P(V) compounds 


Katherine C. Forbes and Eric N. Jacobsen* 


The stereoselective synthesis of molecules bearing stereogenic phosphorus(V) centers represents
an enduring challenge in organic chemistry. Although stereospecific nucleophilic substitution
at P(V) provides a general strategy for elaborating optically active P(V) compounds, existing
methods for accessing the requisite chiral building blocks rely almost entirely on diastereocontrol
using chiral auxiliaries. Catalytic, enantioselective methods for the synthesis of synthetically
versatile stereogenic P(V) building blocks offer an alternative approach to stereogenic-at-P(V)
targets without requiring stoichiometric quantities of chiral control elements. Here, we report
an enantioselective hydrogen-bond-donor-catalyzed synthesis of aryl chlorophosphonamidates and
the development of these products as versatile chiral P(V) building blocks. We demonstrate that
the two leaving groups on these chlorophosphonamidates can be displaced sequentially and
stereospecifically to access a wide variety of stereogenic-at-P(V) compounds featuring diverse
substitution patterns.


Phosphorus(V) stereocenters are present
in a wide assortment of important mol- 
ecules, including several recently de- 
veloped pharmaceuticals (Fig. 1A). The 
absolute stereochemistry at phosphorus 
is often directly associated with the biological 
activity of those molecules (1-7).
Stereogenic-at-phosphorus compounds also serve as broadly
useful ligands and catalysts in asymmetric
organic synthesis (8, 9). Although a variety of 
natural products bearing P-stereogenic centers 
have been identified (10), these molecules are
not practical synthetic building blocks owing 
to their sparsity. Thus, whereas the synthesis of 
compounds bearing C-stereogenic centers has 
historically drawn heavily on nature’s chiral 
pool (11), access to P-stereogenic molecules re- 
lies entirely on de novo synthesis. Nucleophilic 
substitution at stereogenic P(V) centers can 
occur stereospecifically, thereby providing a 
powerful strategy for the synthesis of complex, 
optically active compounds from simple P(V) 
building blocks bearing one or more leaving 
groups attached to phosphorus (9, 11-13).
Effective methods for accessing stereogenic- 
at-phosphorus targets have relied primarily on 
the use of covalently attached chiral auxiliaries 
to achieve diastereocontrol, and a variety of 
chelating auxiliaries have been developed suc- 
cessfully for this purpose (Fig. 1B) (14-22). 
Their applicability depends on stereospecific 
displacement of the auxiliary to forge P(V) 
stereocenters with absolute stereocontrol. 
Among recent advances using the chiral 
auxiliary approach, Baran and colleagues re- 
ported the development of highly reactive 
oxathiaphospholane-sulfide building blocks 
(19, 20). The propensity of the P-S bonds in 
these building blocks to undergo substitution
by both alcohols and organometallic reagents 
was demonstrated and enables the synthesis 
of a variety of stereogenic-at-P(V) compounds, 
ranging from oligonucleotides to chiral phos- 
phine oxides. 

Despite important advances in the stereo- 
selective synthesis of chiral P(V) compounds by 
the chiral auxiliary approach, there are both 
practical and fundamental motivations for de- 
veloping asymmetric catalytic strategies toward 
these targets. In that vein, there have been sev- 
eral recent breakthroughs (Fig. 1B). DiRocco 
and co-workers developed a chiral bisimidazole- 
catalyzed synthesis of phosphoramidate pro- 
drugs through the diastereoselective addition 
of nucleosides to chlorophosphoramidates, pro- 
ceeding via a cooperative mechanism of covalent 
activation of P(V) and general-base activation 
of the alcohol nucleophile (23). An alternative 
approach was demonstrated by Miller and co- 
workers in the catalytic, stereodivergent syn- 
thesis of P-stereogenic oligonucleotides from 
phosphoramidites via chiral phosphoric acid 
catalysis (24). Finally, in work that appeared 
as the present study was being completed, 
Dixon and co-workers reported a catalytic, 
enantioselective desymmetrization of diaryl
phosphonate esters by substitution with ortho- 
substituted phenols (25). Although high levels 
of stereoselectivity were achieved in these 
catalytic, nucleophilic substitution reactions, 
each is limited to a narrow class of nucleophiles 
that are not further displaced. We conceived 
that the catalytic, enantioselective installation 
of a nucleophile that could further serve as a 
leaving group for stereospecific substitution at 
P(V) could provide a generalizable strategy for 
the synthesis of chiral P(V) targets with the 
broad synthetic scope of state-of-the-art auxil- 
iary approaches while avoiding the need for the 
stoichiometric use of chiral control elements. 
We selected chlorophosphonamidates as
potential targets of an enantioselective cat- 
alytic approach (Fig. 1C). The chloride and 
amino groups on P(V) display orthogonal re- 
activity that might permit sequential and 
stereospecific displacement en route to chiral 
P(V) targets bearing a broad range of sub- 
stitution patterns. Given that P-Cl bonds in 
particular are susceptible to substitution by a 
wide variety of nucleophiles (26-28), chloro- 
phosphonamidates would be highly versatile 
precursors to a multitude of P(V) frameworks. 
We report here the development of an enan- 
tioselective method for the synthesis of chlo- 
rophosphonamidate intermediates using a 
commercially available hydrogen-bond (H-bond) 
donor catalyst, as well as the application of 
these P(V) building blocks to the synthesis of 
P(V) compounds featuring diverse substitu- 
tion patterns. 

We recognized that a most concise enantio- 
selective synthesis of chlorophosphonamidates 
would be realized using a catalytic desymmet- 
rization reaction of phosphonic dichlorides 
with amines. Dual H-bond donor catalysts have 
been applied broadly and successfully to pro- 
mote stereoselective nucleophilic substitution 
reactions via chloride-abstraction pathways 
(29-32), and we hypothesized that this reac- 
tivity principle could serve to activate one of 
the two enantiotopic chlorides of a phosphonic 
dichloride electrophile toward displacement 
by an amine. Phenyl phosphonic dichloride 2a 
was selected as a model substrate in reactions 
with various amine nucleophiles and potential 
chiral catalysts (Fig. 2). The chlorophosphona- 
midate products were found to be too reactive 
to isolate in pure form, but solutions of 3 were 
stable and could be separated from other re- 
action components by filtration through silica. 
Epimerization of chlorophosphonamidate 3 
was not observed under the catalytic condi- 
tions, even in the presence of added tetrabutyl- 
ammonium chloride. However, concentrated 
solutions of 3 underwent racemization slowly 
at room temperature over several hours (table 
S10). For purposes of isolation and analysis,
the chlorophosphonamidates were quenched 
with sodium methoxide at low temperature to 
produce the corresponding phosphonamidates 
(e.g., 4a). After systematic evaluation of a se- 
ries of chiral dual H-bond donor catalysts and 
amine nucleophiles, the sulfinamido urea 1a
(33, 34) was found to promote the nucleophilic 
substitution by diisoamylamine in 95% enan- 
tiomeric excess (ee) and quantitative yield 
(Fig. 2A; see supplementary materials for op- 
timization studies). Multiple equivalents of 
amine were required to attain full conversion 
of 2a, as the amine functions both as a nucleophile
and as a stoichiometric Brønsted base to
trap the HCl by-product produced in the re- 
action. Examination of the role of catalyst 
structure revealed the importance of both 
the H-bond donor and the sulfinamide group
in promoting high enantioselectivity. Whereas
sulfinamido urea 1a and its thiourea analog
1b proved similarly effective as catalysts, the
sulfinamide 1d lacking the H-bond donor
motif induced little acceleration above the
uncatalyzed rate (83 versus 64% yield after
24 hours) and afforded only racemic product.
The sulfinamido urea 1c epimeric to 1a also
induced severely diminished enantioselectivity,
a stereochemical “mismatch” effect that has
also been observed in other applications of
this catalyst (33, 34) and one that is strongly
suggestive of cooperative participation of
the H-bond donor and the sulfinamide in
the enantiodetermining step. Arylpyrrolidino
(thio)ureas such as 1e, 1f, and 1g, which have
proven useful in a wide range of asymmetric
anion-binding pathways (35) but lack the
sulfinamide moiety, were catalytically active
but generally poorly effective with respect to
enantiocontrol. The enantioselectivity of the
substitution was also closely tied to the identity
of the amine, with diisoamylamine undergoing
reaction with distinctly superior results
relative to any of the other nucleophiles examined
(Fig. 2B). Beyond a beneficial effect of
distal alkyl branching, it is difficult to discern
any straightforward correlation between the
steric or electronic properties of the amine and
enantioselectivity in the substitution reaction.
Control studies suggest that the properties of
the dialkylammonium chloride by-products
likely play a critical and complex role in influencing
the observed enantioselectivity, either
as inhibitors of the anion-binding H-bond donor
catalyst or by promoting a racemic pathway
between 2a and the dialkylamine (tables
S8 and S9).

Fig. 1. Methods for accessing stereogenic P(V) targets. (A) Representative
bioactive compounds bearing P-stereogenic centers. (B) Synthetic approaches to
stereogenic-at-P(V) targets using chiral auxiliaries (14-22) and stereoselective
catalysis (23, 24). (C) A general approach to chiral P(V) building blocks via
enantioselective catalysis or stereospecific substitution. Ar, aryl; OH, hydroxy group;
MeO, methoxy group; OEt, ethoxy group; iPrO, isopropoxy group; Me, methyl;
OPh, phenoxy group; Ph, phenyl; BnO, benzyloxy group; Et, ethyl; OMe, methoxy
group; R, alkyl group; Ts, para-tolylsulfonyl; RO, alkoxy group; tBu, tert-butyl.

10 June 2022

High levels of enantioselectivity were achieved 
in the reaction of a variety of aryl phosphonic 
dichlorides with diisoamylamine (Fig. 3A). 
Substrates bearing arenes with either electron- 
withdrawing or electron-donating substituents 
underwent substitution with consistently high 
levels of enantioselectivity (4b to 4g). In con- 
trast, alkyl phosphonic dichlorides are ineffec- 
tive substrates for the enantioselective reaction. 
For example, hexylphosphonic dichloride was 
converted to the corresponding phosphona- 
midate in only 26% ee and 50% yield under 
the catalytic conditions. 
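
For readers less used to ee values, the following is a minimal sketch of how an enantiomeric excess translates into an enantiomer ratio, assuming the usual definition ee = (major - minor)/(major + minor) x 100. The function name `enantiomer_ratio` is hypothetical and is not part of the reported work.

```python
def enantiomer_ratio(ee_percent: float) -> tuple[float, float]:
    """Return (major %, minor %) for an enantiomeric excess given in percent.

    Assumes the conventional definition ee = major - minor, with
    major + minor = 100.
    """
    if not 0.0 <= ee_percent <= 100.0:
        raise ValueError("ee must lie between 0 and 100 percent")
    major = (100.0 + ee_percent) / 2.0
    minor = 100.0 - major
    return major, minor


# The 26% ee reported for hexylphosphonic dichloride corresponds to
# roughly a 63:37 mixture of enantiomers.
print(enantiomer_ratio(26.0))  # (63.0, 37.0)
```

By the same arithmetic, a product of 92% ee is a 96:4 mixture, which illustrates how far the alkyl substrate falls short of the aryl cases.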

The products of the enantioselective reac- 
tions feature two chemically distinct leaving 




Fig. 2. Optimization studies. Yield values reflect product quantification
by ³¹P nuclear magnetic resonance relative to an internal standard. Reactions
were carried out using a one-pot procedure without purification of 3.
Concentration values correspond to the initial concentration of the limiting
stoichiometric reagent. (A) Catalyst optimization for enantioselective
reaction of diisoamylamine with phenyl phosphonic dichloride. Reactions were
carried out on a 0.06-mmol scale. (B) Optimization of amine structure for
enantioselective substitution reaction with phenyl phosphonic dichloride.
Reactions were carried out on a 0.06-mmol scale. The single-asterisk
symbol indicates that reaction was performed at -40°C for 48 hours.
R', alkyl group; iAm, isoamyl; Et2O, diethyl ether; THF, tetrahydrofuran;
iBu, isobutyl; iPr, isopropyl; nBu, n-butyl.


groups on phosphorus that could be selec- 
tively and stereospecifically displaced to afford 
access to multiple classes of chiral P(V) com- 
pounds. We first explored the scope of nucleo- 
philes capable of enantiospecific displacement 
of the remaining chloride (Fig. 3B). Reaction of 
3 with alkoxides, phenoxides, thiolates, depro- 
tonated carbamates, and Grignard reagents 
afforded the desired products with high levels 
of enantiospecificity (es) in all cases (5a to 5h). 
The substitution reactions could be performed 
after the enantioselective catalytic step with 
or without purification of 3 in solution (see 
supplementary materials for details). We 
found that the reactions could be scaled up 
without loss of enantioselectivity or yield; thus, 
the synthesis of 5d was performed by the one- 
pot procedure on a 3-mmol scale with 5 mol % 
catalyst, affording 1.11 g of product in 95% yield 
and 92% ee (Fig. 3D). 
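
The quantities in this paragraph can be checked with simple arithmetic. The sketch below assumes the conventional definition of enantiospecificity, es = 100 x ee(product)/ee(starting material), which the text does not spell out, and back-calculates the molar mass implied by the reported gram-scale numbers. Both function names are hypothetical helpers introduced only for illustration.

```python
def enantiospecificity(ee_product: float, ee_start: float) -> float:
    """es (%) = 100 * ee(product) / ee(starting material); both ee in percent.

    This is the conventional definition and is assumed here.
    """
    if ee_start <= 0:
        raise ValueError("starting material must be enantioenriched")
    return 100.0 * ee_product / ee_start


def implied_molar_mass(mass_g: float, scale_mmol: float, yield_percent: float) -> float:
    """Molar mass (g/mol) implied by an isolated mass, reaction scale, and yield."""
    moles_product = scale_mmol / 1000.0 * yield_percent / 100.0
    return mass_g / moles_product


# Reported gram-scale run: 3 mmol scale, 95% yield, 1.11 g of 5d.
print(round(implied_molar_mass(1.11, 3.0, 95.0), 1))  # 389.5

# Illustrative only: a product of 92% ee from starting material of 92% ee
# corresponds to complete enantiospecificity.
print(enantiospecificity(92.0, 92.0))  # 100.0
```

The implied molar mass is only a consistency check on the rounded values quoted above, not an independent characterization of 5d.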

The products of the chloride-displacement
reactions could be further elaborated to afford
alkoxy-substituted P(V) compounds via an
acid-mediated stereoinvertive displacement
of the diisoamylamino group (Fig. 3C).
Substitution of 5a to 5h with methanol yielded
a variety of enantioenriched phosphonates,
phosphinates, and phosphonamidates (6a to
6h) with nearly complete enantiospecificity
observed in every case. The slightly diminished
stereospecificity observed with 5g and 5h is
consistent with prior observations (14, 16).
Substitution with other primary alcohols
proceeded with varied but generally high levels of
enantiospecificity (6i to 6k).

The phosphonate ester and thioester products 
6b and 6d have additional readily displace- 
able substituents that render them useful syn- 
thetic building blocks for further elaboration 
to chiral P(V) compounds. For example,
phosphonate thioester 6d underwent reaction with
functionally complex alcohols to furnish the
corresponding phosphonylated biomolecules
with high levels of stereospecificity (7a to 7e)
(Fig. 4A). These substitutions are performed
under Brønsted acid-free conditions using
little or no excess of the alcohol reagent,
highlighting the utility of 6d for the
phosphonylation of precious or acid-sensitive alcohols.
Phosphonate 6b underwent efficient substitu- 
tion with Grignard reagents with displace- 
ment of the electron-deficient aryloxide to 
yield highly enantioenriched phosphinate 
esters, known precursors to chiral phosphine 
oxides (Fig. 4B) (20). This three-step route to 
phosphinate esters was applied to the synthe- 
sis of (+)-SMT022332, a utrophin modulator 
developed as a potential treatment for Duchenne 
muscular dystrophy (36-38). An analog of 
(+)-SMT022332 was previously accessed in 
83% ee and 5% overall yield from 9 using a 
chiral auxiliary-based approach (4). Subjection 
of phosphonic dichloride 9 to the optimized 
conditions for the enantioselective substitu- 
tion yielded phosphonamidate 10, which was 
characterized crystallographically (Fig. 4C). 
Subsequent methanolysis and phenol displace- 
ment furnished (+)-SMT022332 (12) in 94% ee 
and 43% overall yield over three steps. 
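
To put the overall yield in per-step terms, the sketch below computes the geometric-mean yield per step from an overall yield. It assumes only that the overall yield is the product of the individual step yields; the individual step yields themselves are not given in the text, and `average_step_yield` is a hypothetical helper name.

```python
def average_step_yield(overall_percent: float, steps: int) -> float:
    """Geometric-mean yield per step (%) for a linear sequence.

    Assumes overall yield = product of the individual step yields.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if not 0.0 <= overall_percent <= 100.0:
        raise ValueError("yield must lie between 0 and 100 percent")
    return 100.0 * (overall_percent / 100.0) ** (1.0 / steps)


# 43% overall yield over three steps averages to about 75.5% per step.
print(round(average_step_yield(43.0, 3), 1))  # 75.5
```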
Fig. 3. Scope of enantioselective addition of diisoamylamine to aryl phosphonic
dichlorides and stereospecific elaborations. All yield values correspond
to chromatographically purified, isolated products. Concentration values
correspond to the initial concentration of the limiting stoichiometric reagent.
(A) Substrate scope of addition of diisoamylamine to aryl phosphonic dichlorides
catalyzed by 1a. Reactions were carried out on a 0.2-mmol scale. The absolute
stereochemistry of the products was assigned on the basis of the x-ray
crystal structure of 10 and the known optical rotation of 8a (Fig. 4; see supplementary
materials). (B) Scope of nucleophiles for enantiospecific substitution with 3.
(C) Enantiospecific displacement of the diisoamylamino group with alcohols. See
supplementary materials for reaction conditions. rt, room temperature.
(D) Gram-scale synthesis of 5d. Prices from Thermo Fisher Scientific (February
2022). The symbols in the figure indicate reactions carried out under the following
conditions: *, at -78°C with 20 mol % catalyst loading; †, at -40°C with
4.5 equiv of diisoamylamine; ‡, in a two-pot procedure involving generation
of 3 in solution and purification by filtration through silica and subsequent reaction
with 2 equiv of nucleophile; §, using one-pot procedure without purification
of 3 with 5 equiv of nucleophile; ¶, in a two-pot procedure involving generation of 3
in solution and purification by filtration through silica and subsequent reaction
with 5 equiv of nucleophile; #, on a 0.9-mmol scale; **, on a 1.0-mmol scale; ††, on a
0.57-mmol scale; ‡‡, on a 0.24-mmol scale; §§, run at 0.3 M concentration instead of
0.2 M; ##, H3PO3 was used instead of para-tolylsulfonic acid.
Fig. 4. Application to the synthesis of chiral P(V) targets. All yield values
refer to chromatographically purified, isolated products. Concentration
values correspond to the initial concentration of the limiting stoichiometric
reagent. (A) Stereospecific phosphonylation of precious alcohols with 6d.
Reactions were carried out on a 0.1-mmol scale. (B) Stereospecific addition of
Grignard reagents to 6b for the synthesis of enantioenriched phosphinate esters.
Absolute stereochemistry of 8a was determined by comparison of optical
rotation to literature value; others were assigned by analogy. Reactions run on a
0.05- to 0.1-mmol scale. The single-asterisk symbol indicates product prepared
from 6b that was 92% ee. (C) Application of method to the enantioselective
synthesis of (+)-SMT022332. Yield values refer to isolated yields. Absolute
stereochemistry of 10 assigned by the depicted x-ray crystal structure, and of 12
by comparison of the optical rotation to the literature value. (D) Orthogonally
N-protected chlorophosphonamidate. (E) Formal synthesis of a matrix
metalloproteinase inhibitor. dr, diastereomeric ratio; ds, diastereospecificity;
TFA, trifluoroacetic acid.
In addition to serving as versatile synthetic
building blocks, phosphonamidates are often
synthetic targets themselves (2, 3, 5, 6, 39-43),
and general access to these compounds by the
catalytic procedure would be desirable. However,
the structural requirements on the amine for
achieving high enantioselectivity in the catalytic 
reaction impose restrictions to the N-substituents 
that can be introduced directly (Fig. 2B). We 
therefore sought to identify amine derivatives 
that participate successfully in the enantiose- 
lective reaction while bearing orthogonally 
cleavable N-protecting groups that might pro- 
vide centralized access to a variety of substituted 
phosphonamidates (Fig. 4D). High enantioselec- 
tivity was obtained using N-allyl benzylamine 
in the substitution reaction under modified 
conditions. The benzyl group and the allyl group 
on the chlorophosphonamidate products can 
each be cleaved successively, enabling their 
sequential replacement (see supplementary 
materials) (44-48). This strategy was exploited 
in the synthesis of phosphonamidate 17, a 
matrix metalloproteinase (MMP) inhibitor 
with demonstrated anticancer activity (Fig. 
4E) (2). Phosphonic dichloride 2h effectively 
underwent the catalytic reaction with N-allyl 
benzylamine to produce, after quenching with 
allyl alkoxide, phosphonamidate 13 in 89% 
ee and 88% yield. Phosphonamidate 13 was 
elaborated over three steps to afford cyclic 
phosphonamidate 16 in 90% ee, completing 
the enantioselective formal synthesis of MMP 
inhibitor 17. We anticipate that the versatility of
N-allyl benzylamine as a masked “-NH2” equivalent
may enable access to a wide variety of
phosphonamidate targets.

Moreover, we expect the versatile enan- 
tioenriched chlorophosphonamidate inter- 
mediates accessed by means of the synthetic 
strategies outlined herein to enable the facile 
synthesis of both known and new stereogenic- 
at-P(V) compounds of interest. 


Forbes et al., Science 376, 1230-1236 (2022)    10 June 2022


REFERENCES AND NOTES

1. U. Pradere, E. C. Garnier-Amblard, S. J. Coats, F. Amblard, R. F. Schinazi, Chem. Rev. 114, 9154-9218 (2014).
2. M. D. Sorensen et al., Bioorg. Med. Chem. 11, 5461-5484 (2003).
3. M. Sawa et al., J. Med. Chem. 45, 919-929 (2002).
4. A. Babbs et al., Tetrahedron 76, 130819 (2020).
5. A. Nocentini et al., J. Med. Chem. 63, 5185-5200 (2020).
6. A. Nocentini, P. Gratteri, C. T. Supuran, Chemistry 25, 1188-1192 (2019).
7. W. A. Lee et al., Antimicrob. Agents Chemother. 49, 1898-1906 (2005).
8. T. Imamoto, Proc. Jpn. Acad. Ser. B Phys. Biol. Sci. 97, 520-542 (2021).
9. M. Dutartre, J. Bayardon, S. Jugé, Chem. Soc. Rev. 45, 5771-5794 (2016).
10. O. I. Kolodiazhnyi, Symmetry 13, 889 (2021). 10.3390/sym13050889
11. O. I. Kolodiazhnyi, A. Kolodiazhna, Tetrahedron Asymmetry 28, 1651-1674 (2017).
12. O. I. Kolodiazhnyi, Tetrahedron Asymmetry 23, 1-46 (2012).
13. X. Ye, L. Peng, X. Bao, C.-H. Tan, H. Wang, Green Synth. Catal. 2, 6-18 (2021).
14. S. Jugé, J. P. Genet, Tetrahedron Lett. 30, 2783-2786 (1989).
15. S. Jugé, M. Stephan, J. A. Laffitte, J. P. Genet, Tetrahedron Lett. 31, 6357-6360 (1990).
16. T. Koizumi, R. Yanada (née Ishizaka), H. Takagi, H. Hirai, E. Yoshii, Tetrahedron Lett. 22, 571-572 (1981).
17. Z. S. Han et al., J. Am. Chem. Soc. 135, 2474-2477 (2013).
18. E. J. Corey, Z. Chen, G. J. Tanoury, J. Am. Chem. Soc. 115, 11000-11001 (1993).
19. K. W. Knouse et al., Science 361, 1234-1238 (2018).
20. D. Xu et al., J. Am. Chem. Soc. 142, 5785-5792 (2020).
21. K. Kuwabara, Y. Maekawa, M. Minoura, T. Maruyama, T. Murai, J. Org. Chem. 85, 14446-14455 (2020).
22. A. Mondal, N. O. Thiel, R. Dorel, B. L. Feringa, Nat. Catal. 5, 10-19 (2022).
23. D. A. DiRocco et al., Science 356, 426-430 (2017).
24. A. L. Featherston et al., Science 371, 702-707 (2021).
25. M. Formica et al., ChemRxiv [Preprint] (2021). https://doi.org/10.26434/chemrxiv-2021-5714s-v2.
26. C. Bauduin, D. Moulin, B. Kaloun, C. Darcel, S. Jugé, J. Org. Chem. 68, 4293-4301 (2003).
27. T. Kimura, T. Murai, Chem. Commun. 2005, 4077-4079 (2005).
28. T. Kimura, T. Murai, Chem. Lett. 33, 878-879 (2004).
29. A. G. Doyle, E. N. Jacobsen, Chem. Rev. 107, 5713-5743 (2007).
30. D. A. Kutateladze, D. A. Strassfeld, E. N. Jacobsen, J. Am. Chem. Soc. 142, 6951-6956 (2020).
31. A. J. Bendelsmith, S. C. Kim, M. Wasa, S. P. Roche, E. N. Jacobsen, J. Am. Chem. Soc. 141, 11414-11419 (2019).
32. D. D. Ford, D. Lehnherr, C. R. Kennedy, E. N. Jacobsen, ACS Catal. 6, 4616-4620 (2016).
33. K. L. Tan, E. N. Jacobsen, Angew. Chem. Int. Ed. 46, 1315-1317 (2007).
34. H. Xu, S. J. Zuend, M. G. Woll, Y. Tao, E. N. Jacobsen, Science 327, 986-990 (2010).
35. D. A. Strassfeld, E. N. Jacobsen, in Supramolecular Catalysis: New Directions and Developments, P. W. N. M. van Leeuwen, M. Raynal, Eds. (Wiley, 2022), chap. 25, pp. 361-385.
36. I. V. L. Wilkinson et al., Angew. Chem. Int. Ed. 59, 2420-2428 (2020).
37. A. Babbs et al., J. Med. Chem. 63, 7880-7891 (2020).
38. M. Chatzopoulou et al., ACS Med. Chem. Lett. 11, 2421-2427 (2020).
39. M. Van Overtveldt, T. S. A. Heugebaert, I. Verstraeten, D. Geelen, C. V. Stevens, Org. Biomol. Chem. 13, 5260-5264 (2015).
40. M. Buti, M. Riveiro-Barciela, R. Esteban, J. Infect. Dis. 216 (suppl. 8), S792-S796 (2017).
41. M. Slusarczyk, M. Serpi, F. Pertusati, Antivir. Chem. Chemother. 26, 2040206618775243 (2018).
42. M.-A. Kasper et al., Angew. Chem. Int. Ed. 58, 11625-11630 (2019).
43. N. A. Lentini, B. J. Foust, C. C. Hsiao, A. J. Wiemer, D. F. Wiemer, J. Med. Chem. 61, 8658-8669 (2018).
44. Z. Zhao, Q. Zhu, S. Che, Z. Luo, Y. Lian, Synth. Commun. 50, 2338-2346 (2020).
45. J. Xiao et al., Tetrahedron 74, 4558-4568 (2018).
46. Y. Xu, Q. Su, W. Dong, Z. Peng, D. An, Tetrahedron 73, 4602-4609 (2017).
47. L. Zhong et al., Asian J. Org. Chem. 6, 1072-1079 (2017).
48. Q. Zhu, S. Che, Z. Luo, Z. Zhao, Synth. Commun. 50, 947-957 (2020).


ACKNOWLEDGMENTS 


We thank S.-L. Zheng (Harvard University) for determination of
the x-ray crystal structure and R. Algera, J. Essman, and
H. Sharma for helpful discussions. Funding: Funding was provided
by National Institutes of Health grant GM043214 (E.N.J.). Author
contributions: Both authors conceived of the work. K.C.F.
designed and conducted the experiments. E.N.J. directed the
research. Both authors wrote the manuscript. Competing
interests: The authors declare no competing financial interests.
Data and materials availability: Crystallographic data for
compound 10 are available free of charge from the Cambridge
Crystallographic Data Centre under reference CCDC 2155524. All
other data are available in the main text or the supplementary
materials. License information: Copyright © 2022 the authors,
some rights reserved; exclusive licensee American Association
for the Advancement of Science. No claim to original US
government works.
https://www.science.org/about/science-licenses-journal-article-reuse


SUPPLEMENTARY MATERIALS 


science.org/doi/10.1126/science.abp8488
Materials and Methods
Supplementary Text
Figs. S1 and S2
Tables S1 to S11
References (49-56)

Submitted 1 March 2022; accepted 25 April 2022
10.1126/science.abp8488


WORKING LIFE


Crafty like an engineer


By Christina Petlowany


My classmates were certain we needed to use steel. We were designing a wheelchair for a college
engineering course and they felt only steel would be strong enough for the handheld levers 
that would allow the user to propel the chair with a rowing motion. I wasn’t so sure. Based on 
my experience making sculptures with soda cans and creating jewelry with wire, I believed 
steel would be too heavy and aluminum would be a better option. But the student who most 
strongly advocated for steel worked at a bike shop; surely I didn’t know better, having used 
metal only for crafts. A few days later, when the hefty, overbuilt steel arm kept flopping down, I felt 
validated. I had been right—and I wished I had shown more steely resolve in defending my position. 


I was a crafty kid. Not crafty like a 
fox, but crafty to the point that my 
parents would come home braced 
for whatever “artistic” explosion I 
had unleashed that day—origami, 
painting, clay sculpting, sewing 
stuffed animals and clothes, and 
more. But when I enrolled in en- 
gineering in college, I put these 
pursuits aside. Not only was I 
stretched for time, but I didn’t 
think they were relevant to my 
academic work—and I hesitated to 
highlight my “feminine” crafting 
interests in the male-dominated 
engineering environment where 
I already felt like an outsider. I 
told myself that engineering ad- 
equately fed my creative side and 
I didn’t need the hobby. 

The wheelchair project was a 
hint that my crafting might be im- 
portant and relevant, but for the 
next few years I continued to avoid bringing it up in pro- 
fessional spaces. When I was interviewing for engineer- 
ing jobs after finishing my master’s degree and was asked 
whether I tinkered in my spare time, for example, I was 
sure the panelists wouldn’t care about my elaborate home- 
made holiday cards, even though they featured lever ac- 
tion and moving parts. Instead, I muttered about wanting 
to do more 3D printing. The company extended an offer, 
so I felt my assumption was confirmed. 

My attitude didn’t change when I went on to pursue 
a Ph.D.—until early in the pandemic, when I felt restless 
and turned to crafting as an outlet. I was making a set 
of Dungeons & Dragons dice, shimmery blue and purple 
swirled with gold flakes, as a gift for a friend. While pipetting
the liquid resin into the silicone mold, I made an offhand
joke to my partner that I was “injection molding,” a
standard engineering manufacturing process. I suddenly
realized that although resin art is
not injection molding in the tech- 
nical sense, it shares the spirit and 
probably some skills. Maybe my 
crafting was something I should 
embrace rather than hide. 

Soon I was seeing more ex- 
amples of connections between 
engineering and craft that I had 
previously overlooked. When 
working on the wheelchair proj- 
ect, I put my sewing skills to use 
creating cushioned grips for the 
handles. The engineering “design 
kitchen” where my undergrad 
classmates and I tested our ideas 
was stocked with inexpensive 
tools including felt, pipe clean- 
ers, and popsicle sticks—materials 
that would not be out of place in a 
craft bin, I now realized. I saw how 
crafting taught me to persevere 
when my product didn’t match my 
initial vision and to consider the failed creation a learning 
and prototyping experience, just as an engineer must. 

Since then, I’ve built crafting back into my free time. I’ve 
also stopped hiding it from my colleagues. I mentioned 
my dicemaking escapades at a robotics conference and 
broached in a team meeting how we could gain inspiration 
from an interactive art experience I had recently visited. 
The responses were consistently positive and constructive— 
not dismissive or insulting, as I used to fear. 

I’ve grown from a girl who created a makeshift vulpine 
friend by attaching legs to a stuffed sock and coloring it 
with red Sharpie to an engineer with valuable skills from 
my first passion. Perhaps I am crafty like a fox. I am also 
crafty like an engineer. 